While running a Python script using NLTK I got this:
Traceback (most recent call last):
File "cpicklesave.py", line 56, in <module>
pos = nltk.pos_tag(words)
File "/usr/lib/python2.7/site-packages/nltk/tag/__init__.py", line 110, in pos_tag
tagger = PerceptronTagger()
File "/usr/lib/python2.7/site-packages/nltk/tag/perceptron.py", line 140, in __init__
AP_MODEL_LOC = str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
File "/usr/lib/python2.7/site-packages/nltk/data.py", line 641, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource u'taggers/averaged_perceptron_tagger/averaged_perceptro
n_tagger.pickle' not found. Please use the NLTK Downloader to
obtain the resource: >>> nltk.download()
Searched in:
- '/root/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************
Can anyone explain the problem?
erip
asked Mar 8, 2016 at 7:29
The first answer says the missing module is 'the Perceptron Tagger'; its actual name in nltk.download() is 'averaged_perceptron_tagger'. You can fix the error with:
nltk.download('averaged_perceptron_tagger')
answered Aug 13, 2016 at 10:17
Posuer
TL;DR
import nltk
nltk.download('averaged_perceptron_tagger')
Or to download all packages + data + docs:
import nltk
nltk.download('all')
See How do I download NLTK data?
answered Apr 7, 2017 at 2:01
alvas
Install all nltk resources in one line:
python3 -c "import nltk; nltk.download('all')"
The data will be saved to ~/nltk_data.
Install only a specific resource:
Substitute 'averaged_perceptron_tagger' for 'all' to install only that model:
python3 -c "import nltk; nltk.download('averaged_perceptron_tagger')"
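The "Searched in:" list in errors like the one above comes from NLTK walking a fixed sequence of directories (with ~/nltk_data first for a user install). As a rough pure-Python sketch of that lookup logic (this is only an illustration, not NLTK's actual nltk.data.find implementation):

```python
import os
import tempfile

def find_resource(relative_path, search_dirs):
    """Return the first existing copy of relative_path under search_dirs,
    or raise LookupError listing every directory tried (as NLTK does)."""
    for base in search_dirs:
        candidate = os.path.join(base, relative_path)
        if os.path.exists(candidate):
            return candidate
    raise LookupError(
        "Resource %r not found. Searched in: %s" % (relative_path, search_dirs)
    )

# Demo: create a fake tagger pickle in a temporary "nltk_data" directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "taggers", "averaged_perceptron_tagger")
    os.makedirs(target)
    open(os.path.join(target, "averaged_perceptron_tagger.pickle"), "w").close()

    rel = os.path.join("taggers", "averaged_perceptron_tagger",
                       "averaged_perceptron_tagger.pickle")
    # Found in the second directory, so no LookupError is raised.
    print(find_resource(rel, ["/nonexistent", tmp]) == os.path.join(tmp, rel))
```

Running nltk.download('averaged_perceptron_tagger') simply places the pickle into one of the searched directories, which is why the LookupError disappears afterwards.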
answered Mar 28, 2019 at 16:36
Problem:
Lookup error when extracting count vectorizer from scikit learn. Below is code snippet.
from sklearn.feature_extraction.text import CountVectorizer
bow_transformer = CountVectorizer(analyzer=text_process).fit(X)
Solution:
Run the code below, then install the stopwords resource from the Corpora tab of the NLTK downloader:
import nltk
nltk.download()
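The text_process analyzer in the snippet above is not shown in the question; typically it strips punctuation and stopwords. A minimal pure-Python stand-in (hypothetical, with a tiny hard-coded stopword list in place of nltk.corpus.stopwords.words('english') so it runs even before the download):

```python
import string

# Tiny stand-in stopword list; nltk.corpus.stopwords.words('english')
# provides the full list once the resource is downloaded.
STOPWORDS = {"a", "an", "the", "is", "this", "to", "and"}

def text_process(text):
    # Drop punctuation, lowercase, split into words, and remove stopwords.
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    return [w for w in cleaned.lower().split() if w not in STOPWORDS]

print(text_process("This is a foo bar sentence."))  # ['foo', 'bar', 'sentence']
```

CountVectorizer(analyzer=text_process) would then call this function on each document, so the LookupError surfaces only at fit time, when the analyzer first touches the missing NLTK resource.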
answered Feb 20, 2018 at 13:03
You can download a missing NLTK module simply with
import nltk
nltk.download()
This will show the NLTK download screen.
If it shows an 'SSL: CERTIFICATE_VERIFY_FAILED' error, it should work after disabling the SSL check with the code below:
import nltk
import ssl
try:
_create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
pass
else:
ssl._create_default_https_context = _create_unverified_https_context
nltk.download()
answered Apr 25, 2019 at 6:04
ishwardgret
Apologies if another answer already covers this, but the following works fine in Google Colab:
import nltk
nltk.download('all')
answered Aug 25, 2022 at 13:00
akD
Sometimes nltk.download('module_name') in a script does not actually download the module. In that case, open Python in interactive mode and run nltk.download('module_name') there.
answered Sep 13, 2019 at 3:05
You just need to download that module for nltk.
The simplest way is to open a Python prompt and type
import nltk
nltk.download('all')
That’s all.
answered Feb 6 at 10:11
If you have not installed nltk yet, install it first, then run nltk.download('punkt'); that should give you the result.
kk.
answered Sep 9, 2020 at 13:37
This might work:
import nltk
nltk.download('vader_lexicon')
answered Feb 1, 2021 at 13:40
In the IPython console I typed from nltk.book import *
and got several LookupErrors. Below is the output I got.
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
<ipython-input-3-8446809acbd4> in <module>()
----> 1 from nltk.book import*
C:\Users\dell\Anaconda\lib\site-packages\nltk-3.0.3-py2.7.egg\nltk\book.py in <module>()
20 print("Type: 'texts()' or 'sents()' to list the materials.")
21
---> 22 text1 = Text(gutenberg.words('melville-moby_dick.txt'))
23 print("text1:", text1.name)
24
C:\Users\dell\Anaconda\lib\site-packages\nltk-3.0.3-py2.7.egg\nltk\corpus\util.pyc in __getattr__(self, attr)
97 raise AttributeError("LazyCorpusLoader object has no attribute '__bases__'")
98
---> 99 self.__load()
100 # This looks circular, but its not, since __load() changes our
101 # __class__ to something new:
C:\Users\dell\Anaconda\lib\site-packages\nltk-3.0.3-py2.7.egg\nltk\corpus\util.pyc in __load(self)
62 except LookupError as e:
63 try: root = nltk.data.find('corpora/%s' % zip_name)
---> 64 except LookupError: raise e
65
66 # Load the corpus.
LookupError:
**********************************************************************
Resource u'corpora/gutenberg' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
Searched in:
- 'C:\Users\dell/nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
- 'C:\Users\dell\Anaconda\nltk_data'
- 'C:\Users\dell\Anaconda\lib\nltk_data'
- 'C:\Users\dell\AppData\Roaming\nltk_data'
**********************************************************************
In [4]:
Can I know why I get these errors?
asked Jun 18, 2015 at 7:20
You're missing the Gutenberg corpus used by nltk.book
, hence the error.
The error message is self-descriptive.
You need to use nltk.download()
to download the corpora.
Once the corpus is downloaded, re-run your command and check whether the error comes up again. If it does, it will be for another corpus; download that one too.
from nltk.book import *
is not the preferred method; it is advisable to import only the corpora you will actually use in your code.
You could use from nltk.corpus import gutenberg
instead.
See reference on link
answered Jun 18, 2015 at 8:01
As the NLTK book says, the way to prepare for working with the book is to open the nltk.download()
pop-up, turn to the "Collections" tab, and download the "Book" collection. Do that and you can read the rest of the book with no surprises.
Incidentally you can do the same from the python console, without the pop-ups, by executing nltk.download("book")
answered Jun 18, 2015 at 22:24
alexis
It seems NLTK searches for the data only in specific places (as listed in the error message). Try copying the nltk_data content into one of those directories (or create one), such as D:\nltk_data.
This solved the issue for me: the error kept showing up even though the Gutenberg corpus was already downloaded, because NLTK did not find it in any of those places.
An excerpt from the error you get (these are the directories among which you can choose where to place the nltk content so that it can be found):
- 'C:\Users\dell/nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
- 'C:\Users\dell\Anaconda\nltk_data'
- 'C:\Users\dell\Anaconda\lib\nltk_data'
- 'C:\Users\dell\AppData\Roaming\nltk_data'
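As an alternative to copying files into one of those directories, NLTK also honors the NLTK_DATA environment variable and prepends any directories listed there to the search path. A sketch for a Unix-like shell (the directory name my_nltk_data is just an example; on Windows, set the variable through System Properties or setx):

```shell
# Create a custom data directory and point NLTK at it via NLTK_DATA;
# NLTK splits the variable on the OS path separator and searches
# those directories before the built-in locations.
mkdir -p "$HOME/my_nltk_data"
export NLTK_DATA="$HOME/my_nltk_data"
echo "$NLTK_DATA"
```

After setting the variable, run nltk.download() again (or move your existing nltk_data content into that directory) and the resource should be found.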
answered Feb 21, 2018 at 17:42
Maybe you should download the nltk_data package in the following directory:
Johnny Bones
answered Nov 27, 2015 at 15:47
I am trying to run a sentiment analysis project and I will be using the stop-words method. I did some research and found that nltk has stopwords, but when I execute the command I get an error.
What I am doing is the following, to find out which words nltk uses (like what you can find here http://www.nltk.org/book/ch02.html in section 4.1):
from nltk.corpus import stopwords
stopwords.words('english')
But when I press enter, I get
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
<ipython-input-6-ff9cd17f22b2> in <module>()
----> 1 stopwords.words('english')
C:\Users\Usuario\Anaconda\lib\site-packages\nltk\corpus\util.pyc in __getattr__(self, attr)
66
67 def __getattr__(self, attr):
---> 68 self.__load()
69 # This looks circular, but its not, since __load() changes our
70 # __class__ to something new:
C:\Users\Usuario\Anaconda\lib\site-packages\nltk\corpus\util.pyc in __load(self)
54 except LookupError, e:
55 try: root = nltk.data.find('corpora/%s' % zip_name)
---> 56 except LookupError: raise e
57
58 # Load the corpus.
LookupError:
**********************************************************************
Resource 'corpora/stopwords' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
Searched in:
- 'C:\Users\Meru/nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
- 'C:\Users\Meru\Anaconda\nltk_data'
- 'C:\Users\Meru\Anaconda\lib\nltk_data'
- 'C:\Users\Meru\AppData\Roaming\nltk_data'
**********************************************************************
And because of this problem, things like the following cannot work properly (they get the same error):
>>> from nltk.corpus import stopwords
>>> stop = stopwords.words('english')
>>> sentence = "this is a foo bar sentence"
>>> print [i for i in sentence.split() if i not in stop]
Do you know what the problem might be? I have to use words in Spanish; do you recommend another method? I also thought of using the Goslate package with datasets in English.
Thanks for reading!
P.S.: I am using Anaconda
I am designing a web service that includes a topic modelling technique, so I used your code and noticed there is a problem with nltk. I am a beginner in natural language processing, and this is the first time I have met this problem.
In my terminal shell, it shows:
File "C:\Users\IBL\.virtualenvs\python36env\lib\site-packages\nltk-3.4-py3.7-win32.egg\nltk\corpus\util.py", line 86, in __load
root = nltk.data.find('{}/{}'.format(self.subdir, zip_name))
File "C:\Users\IBL\.virtualenvs\python36env\lib\site-packages\nltk-3.4-py3.7-win32.egg\nltk\data.py", line 699, in find
raise LookupError(resource_not_found)
LookupError:
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
nltk.download('stopwords')
Attempted to load corpora/stopwords.zip/stopwords/
Searched in:
- 'C:\Users\IBL/nltk_data'
- 'C:\Users\IBL\.virtualenvs\python36env\nltk_data'
- 'C:\Users\IBL\.virtualenvs\python36env\share\nltk_data'
- 'C:\Users\IBL\.virtualenvs\python36env\lib\nltk_data'
- 'C:\Users\IBL\AppData\Roaming\nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "build_topic_model_browser.py", line 26, in <module>
min_absolute_frequency=min_tf)
File "C:\Users\IBL\desktop\Topic modelling\TOM\tom_lib\structure\corpus.py", line 41, in __init__
stopWords = set(stopwords.words('french'))
File "C:\Users\IBL\.virtualenvs\python36env\lib\site-packages\nltk-3.4-py3.7-win32.egg\nltk\corpus\util.py", line 123, in __getattr__
self.__load()
File "C:\Users\IBL\.virtualenvs\python36env\lib\site-packages\nltk-3.4-py3.7-win32.egg\nltk\corpus\util.py", line 88, in __load
raise e
File "C:\Users\IBL\.virtualenvs\python36env\lib\site-packages\nltk-3.4-py3.7-win32.egg\nltk\corpus\util.py", line 83, in __load
root = nltk.data.find('{}/{}'.format(self.subdir, self.__name))
File "C:\Users\IBL\.virtualenvs\python36env\lib\site-packages\nltk-3.4-py3.7-win32.egg\nltk\data.py", line 699, in find
raise LookupError(resource_not_found)
LookupError:
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
nltk.download('stopwords')
Attempted to load corpora/stopwords
Searched in:
- 'C:\Users\IBL/nltk_data'
- 'C:\Users\IBL\.virtualenvs\python36env\nltk_data'
- 'C:\Users\IBL\.virtualenvs\python36env\share\nltk_data'
- 'C:\Users\IBL\.virtualenvs\python36env\lib\nltk_data'
- 'C:\Users\IBL\AppData\Roaming\nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
I used an approach that includes putting this on the command line:
import nltk
nltk.download('stopwords')
But it doesn't work. Please help me!