There are different ways to install scikit-learn:
- Install the latest official release. This is the best approach for most users. It will provide a stable version and pre-built packages are available for most platforms.
- Install the version of scikit-learn provided by your operating system or Python distribution. This is a quick option for those who have operating systems or Python distributions that distribute scikit-learn. It might not provide the latest release version.
- Build the package from source. This is best for users who want the latest-and-greatest features and aren’t afraid of running brand-new code. This is also needed for users who wish to contribute to the project.
Installing the latest release
The steps depend on your operating system (Windows, macOS, Linux) and packager (pip or conda).
Windows, pip: Install the 64-bit version of Python 3, for instance from https://www.python.org. Then run:
python -m venv sklearn-venv
sklearn-venv\Scripts\activate
pip install -U scikit-learn
macOS, pip: Install Python 3 using homebrew (brew install python) or by manually installing the package from https://www.python.org. Then run:
python -m venv sklearn-venv
source sklearn-venv/bin/activate
pip install -U scikit-learn
Linux, pip: Install python3 and python3-pip using the package manager of the Linux distribution. Then run:
python3 -m venv sklearn-venv
source sklearn-venv/bin/activate
pip3 install -U scikit-learn
Any operating system, conda: Install conda using the Anaconda or miniconda installers or the miniforge installers (no administrator permission required for any of those). Then run:
conda create -n sklearn-env -c conda-forge scikit-learn
conda activate sklearn-env
In order to check your installation you can use the following commands.
With pip (use python3 instead of python on Linux):
python -m pip show scikit-learn # to see which version and where scikit-learn is installed
python -m pip freeze # to see all packages installed in the active virtualenv
python -c "import sklearn; sklearn.show_versions()"
With conda:
conda list scikit-learn # to see which scikit-learn version is installed
conda list # to see all packages installed in the active conda environment
python -c "import sklearn; sklearn.show_versions()"
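The same check can be done from inside Python. The following is a minimal sketch using only the standard library (Python 3.8 or newer); the helper name installed_version is hypothetical and not part of scikit-learn or pip.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "scikit-learn" is the distribution name on PyPI; "sklearn" is the import name.
    print("scikit-learn:", installed_version("scikit-learn"))
```

A result of None means that the interpreter running the script cannot see the distribution, which usually points to an inactive or different environment.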
Note that in order to avoid potential conflicts with other packages it is
strongly recommended to use a virtual environment (venv) or a conda environment.
Using such an isolated environment makes it possible to install a specific
version of scikit-learn with pip or conda and its dependencies independently of
any previously installed Python packages. In particular under Linux it is
discouraged to install pip packages alongside the packages managed by the
package manager of the distribution (apt, dnf, pacman…).
Note that you should always remember to activate the environment of your choice
prior to running any Python command whenever you start a new terminal session.
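To confirm that an environment is actually active, one option is to ask the interpreter itself. This sketch assumes a standard venv (where sys.prefix differs from sys.base_prefix) or a conda environment activated with conda activate (which sets the CONDA_DEFAULT_ENV variable); active_environment is a hypothetical helper name.

```python
import os
import sys


def active_environment():
    """Describe the isolated environment the current interpreter runs in."""
    # Inside a venv, sys.prefix points at the venv, not the base install.
    if sys.prefix != sys.base_prefix:
        return "venv at " + sys.prefix
    # "conda activate" exports the name of the active environment.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return "conda environment " + conda_env
    return None


if __name__ == "__main__":
    print(active_environment() or "no isolated environment detected")
```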
If you have not installed NumPy or SciPy yet, you can also install these using
conda or pip. When using pip, please ensure that binary wheels are used,
and NumPy and SciPy are not recompiled from source, which can happen when using
particular configurations of operating system and hardware (such as Linux on
a Raspberry Pi).
Scikit-learn plotting capabilities (i.e., functions that start with “plot_” and classes that end with “Display”) require Matplotlib. The examples require Matplotlib and some examples require scikit-image, pandas, or seaborn. The minimum versions of scikit-learn’s dependencies are listed below along with their purpose.
Dependency | Minimum Version | Purpose |
---|---|---|
numpy | 1.17.3 | build, install |
scipy | 1.3.2 | build, install |
joblib | 1.1.1 | install |
threadpoolctl | 2.0.0 | install |
cython | 0.29.24 | build |
matplotlib | 3.1.3 | benchmark, docs, examples, tests |
scikit-image | 0.16.2 | docs, examples, tests |
pandas | 1.0.5 | benchmark, docs, examples, tests |
seaborn | 0.9.0 | docs, examples |
memory_profiler | 0.57.0 | benchmark, docs |
pytest | 5.3.1 | tests |
pytest-cov | 2.9.0 | tests |
flake8 | 3.8.2 | tests |
black | 22.3.0 | tests |
mypy | 0.961 | tests |
pyamg | 4.0.0 | tests |
sphinx | 4.0.1 | docs |
sphinx-gallery | 0.7.0 | docs |
numpydoc | 1.2.0 | docs, tests |
Pillow | 7.1.2 | docs |
pooch | 1.6.0 | docs, examples, tests |
sphinx-prompt | 1.3.0 | docs |
sphinxext-opengraph | 0.4.2 | docs |
plotly | 5.10.0 | docs, examples |
conda-lock | 1.3.0 | maintenance |
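As an illustration, the install-time minimums from the table can be compared with what is installed in the current environment. This is a rough sketch, not scikit-learn's own check: the helper names are hypothetical, only the four "install" dependencies are copied from the table, and the comparison looks at the leading numeric release components only (pre-release suffixes such as rc1 are ignored).

```python
import re
from importlib import metadata

# Install-time minimum versions copied from the table above.
MINIMUM_VERSIONS = {
    "numpy": "1.17.3",
    "scipy": "1.3.2",
    "joblib": "1.1.1",
    "threadpoolctl": "2.0.0",
}


def release_tuple(version, width=3):
    """Turn '1.17.3' or '1.26.0rc1' into a zero-padded tuple of ints."""
    numbers = []
    for part in version.split(".")[:width]:
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers + [0] * (width - len(numbers)))


def meets_minimum(installed, minimum):
    """Compare two version strings by their numeric release components."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Map each dependency to True, False, or None (not installed)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(name, status)
```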
Warning
Scikit-learn 0.20 was the last version to support Python 2.7 and Python 3.4.
Scikit-learn 0.21 supported Python 3.5-3.7.
Scikit-learn 0.22 supported Python 3.5-3.8.
Scikit-learn 0.23 — 0.24 require Python 3.6 or newer.
Scikit-learn 1.0 supported Python 3.7-3.10.
Scikit-learn 1.1 and later require Python 3.8 or newer.
Note
For installing on PyPy, PyPy3-v5.10+, NumPy 1.14.0+, and SciPy 1.1.0+
are required.
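The lower bounds in the warning above can be turned into a quick interpreter check. This is only a sketch: the mapping below copies the minimum Python versions stated in the list, does not enforce the upper bounds (for example 3.10 for scikit-learn 1.0), and python_is_new_enough is a hypothetical helper name.

```python
import sys

# Minimum Python version per scikit-learn release series, from the list above.
MINIMUM_PYTHON = {
    "0.23": (3, 6),
    "0.24": (3, 6),
    "1.0": (3, 7),
    "1.1": (3, 8),
}


def python_is_new_enough(series, python_version=None):
    """Check an interpreter version against the minimum for a release series."""
    if python_version is None:
        python_version = sys.version_info[:2]
    return tuple(python_version) >= MINIMUM_PYTHON[series]


if __name__ == "__main__":
    print("OK for scikit-learn 1.1+:", python_is_new_enough("1.1"))
```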
Installing on Apple Silicon M1 hardware
The recently introduced macos/arm64 platform (sometimes also known as macos/aarch64) requires the open source community to upgrade the build configuration and automation to properly support it.
At the time of writing (January 2021), the only way to get a working
installation of scikit-learn on this hardware is to install scikit-learn and its
dependencies from the conda-forge distribution, for instance using the miniforge
installers:
https://github.com/conda-forge/miniforge
The following issue tracks progress on making it possible to install
scikit-learn from PyPI with pip:
https://github.com/scikit-learn/scikit-learn/issues/19137
Third party distributions of scikit-learn
Some third-party distributions provide versions of
scikit-learn integrated with their package-management systems.
These can make installation and upgrading much easier for users since
the integration includes the ability to automatically install
dependencies (numpy, scipy) that scikit-learn requires.
The following is an incomplete list of OS and Python distributions that provide their own version of scikit-learn.
Alpine Linux
Alpine Linux’s package is provided through the official repositories as py3-scikit-learn for Python. It can be installed by typing the following command:
sudo apk add py3-scikit-learn
Arch Linux
Arch Linux’s package is provided through the official repositories as python-scikit-learn for Python. It can be installed by typing the following command:
sudo pacman -S python-scikit-learn
Debian/Ubuntu
The Debian/Ubuntu package is split into three different packages called python3-sklearn (Python modules), python3-sklearn-lib (low-level implementations and bindings), and python3-sklearn-doc (documentation). Only the Python 3 version is available in Debian Buster (the more recent Debian distribution). Packages can be installed using apt-get:
sudo apt-get install python3-sklearn python3-sklearn-lib python3-sklearn-doc
Fedora
The Fedora package is called python3-scikit-learn for the Python 3 version, the only one available in Fedora30. It can be installed using dnf:
sudo dnf install python3-scikit-learn
NetBSD
scikit-learn is available via pkgsrc-wip:
MacPorts for Mac OSX
The MacPorts package is named py<XY>-scikit-learn, where XY denotes the Python version. It can be installed by typing the following command:
sudo port install py39-scikit-learn
Anaconda and Enthought Deployment Manager for all supported platforms
Anaconda and Enthought Deployment Manager both ship with scikit-learn in addition to a large set of scientific Python libraries for Windows, Mac OSX and Linux.
Anaconda offers scikit-learn as part of its free distribution.
Intel conda channel
Intel maintains a dedicated conda channel that ships scikit-learn:
conda install -c intel scikit-learn
This version of scikit-learn comes with alternative solvers for some common
estimators. Those solvers come from the DAAL C++ library and are optimized for
multi-core Intel CPUs.
Note that those solvers are not enabled by default; please refer to the daal4py documentation for more details.
Compatibility with the standard scikit-learn solvers is checked by running the
full scikit-learn test suite via automated continuous integration as reported
on https://github.com/IntelPython/daal4py.
WinPython for Windows
The WinPython project distributes
scikit-learn as an additional plugin.
Troubleshooting
Error caused by file path length limit on Windows
It can happen that pip fails to install packages when reaching the default path size limit of Windows if Python is installed in a nested location such as the AppData folder structure under the user home directory, for instance:
C:\Users\username>C:\Users\username\AppData\Local\Microsoft\WindowsApps\python.exe -m pip install scikit-learn
Collecting scikit-learn
...
Installing collected packages: scikit-learn
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: 'C:\Users\username\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\sklearn\datasets\tests\data\openml\292\api-v1-json-data-list-data_name-australian-limit-2-data_version-1-status-deactivated.json.gz'
In this case it is possible to lift that limit in the Windows registry by using the regedit tool:
- Type “regedit” in the Windows start menu to launch regedit.
- Go to the Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem key.
- Edit the value of the LongPathsEnabled property of that key and set it to 1.
- Reinstall scikit-learn (ignoring the previous broken installation):
pip install --exists-action=i scikit-learn
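Before editing the registry, it may help to read the current setting. The sketch below uses the standard-library winreg module, which only exists on Windows, so the function returns None elsewhere; long_paths_enabled is a hypothetical helper name and the code only reads the value, it never modifies it.

```python
import sys


def long_paths_enabled():
    """Return True/False for the LongPathsEnabled value, or None if unknown."""
    if sys.platform != "win32":
        return None  # The registry only exists on Windows.
    import winreg

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return None  # Key or value missing, or access denied.
    return value == 1


if __name__ == "__main__":
    print("LongPathsEnabled:", long_paths_enabled())
```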
Sklearn is an open source Python library that implements a range of machine learning, preprocessing, cross-validation and visualization algorithms, and it is often the first tool people reach for when analyzing data with machine learning. Although I already had experience installing the sklearn library on Windows, this time I ran into problems installing it on my new computer.
Problem 1. pip install sklearn/scipy failed
$ pip install sklearn
failed building wheel for scikit-learn
So I checked the requirements on the sklearn official site and found that I had not installed scipy beforehand.
Scikit-learn requires:
Python (>= 2.6 or >= 3.3),
NumPy (>= 1.6.1),
SciPy (>= 0.9).
Since sklearn depends on scipy, I needed to install scipy before installing sklearn. However, the same error occurred.
$ pip install sklearn
failed building wheel for scikit-learn
To solve this problem, I need to download the needed wheel manually and install it by the following command:
$ pip install <filename>.whl
- Python Wheels Download Sites
- Numpy
- Scipy
- Sklearn
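Problem 2. filename.whl is not a supported wheel on this platform
To check which wheel should be downloaded and installed, you can enter the following Python commands in a shell. This relies on the pip.pep425tags module found in older pip releases; newer releases no longer expose it and provide the pip debug --verbose command to list compatible tags instead.
>>> import pip
>>> print(pip.pep425tags.get_supported())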
[('cp36', 'cp36m', 'win32'), ('cp36', 'none', 'win32'), ('py3', 'none', 'win32'), ('cp36', 'none', 'any'), ('cp3', 'none', 'any'), ('py36', 'none', 'any'), ('py3', 'none', 'any'), ('py35', 'none', 'any'), ('py34', 'none', 'any'), ('py33', 'none', 'any'), ('py32', 'none', 'any'), ('py31', 'none', 'any'), ('py30', 'none', 'any')]
Make sure that every tag in your wheel file name (the sections separated by ‘-’) is included in the supported tags.
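The tag comparison can also be done mechanically. The following sketch assumes the standard wheel naming scheme, in which the last three dash-separated fields are the Python, ABI and platform tags; it does not expand compressed tag sets such as py2.py3, and wheel_tags and is_supported are hypothetical helper names.

```python
def wheel_tags(filename):
    """Split a wheel file name into (python tag, abi tag, platform tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel file name: " + filename)
    # The last three dash-separated fields are always the tags.
    return tuple(parts[-3:])


def is_supported(filename, supported):
    """Check whether the wheel's tag triple appears in the supported list."""
    return wheel_tags(filename) in supported


# First entries of the supported-tag list shown above.
SUPPORTED = [
    ("cp36", "cp36m", "win32"),
    ("cp36", "none", "win32"),
    ("py3", "none", "win32"),
    ("cp36", "none", "any"),
]

if __name__ == "__main__":
    print(is_supported("scipy-0.19.0-cp36-cp36m-win32.whl", SUPPORTED))
```

With the list above, a win32 wheel built for cp36 matches, while the same wheel built for win_amd64 does not, which is exactly the situation that produces the "not a supported wheel" error.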
Successfully Installed
After installing numpy, scipy, and sklearn from wheels, in that order, sklearn was successfully installed.
$ pip install numpy-1.12.1+mkl-cp36-cp36m-win32.whl
$ pip install scipy-0.19.0-cp36-cp36m-win32.whl
$ pip install scikit_learn-0.18.1-cp36-cp36m-win32.whl
No module named ‘sklearn’ error, though pip or pip3 both show sklearn is installed
I played a bit today with pyenv, trying to install a certain Python version as well as sklearn for data science use, but it appears I have broken it: I cannot import sklearn, even though when I tried to install it using pip3/pip, I got messages showing that sklearn had already been installed.
The current situation: in a Jupyter notebook, importing sklearn shows a "No module named ‘sklearn’" error. But when I try to install sklearn using pip3, the messages say it is already installed.
I’ve also tried to install sklearn outside the Jupyter notebook, and the messages are the same.
What I did earlier today:
- My Mac (High Sierra) already had Python 2.7, but I needed Python 3, so I first installed Python 3.
- I installed Jupyter Notebook.
- In the Jupyter notebook, I attempted to use
!pip3 install sklearn
to install sklearn, but got some errors. By researching online, I found out that sklearn did not seem to support the most recent Python 3.7.
- I uninstalled Python 3 as well as Jupyter Notebook, before trying to get an older version of Python.
- I tried to use brew to get an older version of Python, but found out online that brew does not install a previous version easily, so I installed pyenv instead according to some online post (without actually understanding it very well).
- In pyenv I installed Python 2.6.5.
- I set 2.6.5 as the global Python version:
pyenv global 2.6.5
- I installed Jupyter Notebook again (and maybe I also installed ipython at the same time), which seems to depend on Python 3.7, so from the log Python 3.7 was installed (which is not what I wanted).
- When I ran
pip3 install sklearn
it showed that sklearn was installed. However, when I tried to import sklearn it showed Module not found.
Could any please point to a direction what could have gone wrong? The above list may not be very accurate as I may have repeatedly installed and uninstall things just to try out. But the more I tried, the more confused I get. I would really appreciate any help. Thank you!
ModuleNotFoundError: No module named ‘sklearn’
I want to import sklearn but there is no module apparently:
I am using Anaconda and Python 3.6.1 ; I have checked everywhere but still can’t find answers.
When I use the command: conda install scikit-learn should this not just work?
Where does anaconda install the package?
I was checking the frameworks in my python library and there was nothing about sklearn only numpy and scipy.
Please help, I am new to using python packages especially via anaconda.
15 Answers
You can just use pip for installing packages, even when you are using anaconda:
This should work for installing the package.
And for Python 3.x just use pip3 :
Will leave below two options that may help one solve the problem:
One might want to consider the notes at the end, specially before resorting to the 2nd option.
Option 1
If one wants to install it in the root and one follows the requirements — (Python ( >= 2.7 or >= 3.4 ), NumPy ( >= 1.8.2 ), SciPy ( >= 0.13.3 ).) — the following should solve the problem
Alternatively, as mentioned here, one can specify the channel as follows
Let’s say that one is working in the environment with the name ML.
Then the following should solve one’s problem:
Option 2
If the above doesn’t work, on Anaconda Prompt one can also use pip (here’s how to pip install scikit-learn), so the following may help
However, consider the last note below before proceeding.
Notes:
When using Anaconda, one needs to be aware of the environment that one is working.
Then, in Anaconda Prompt, one needs to run the following
$command — Command that one intends to use (consult documentation for general commands)
$ENVIRONMENT NAME — The name of one’s environment (if one is working in the root, conda $command $IDE/package/module is enough)
$IDE/package/module — The name of the IDE or package or module
If one needs to install/update packages, the logic is the same as mentioned in the introduction. If you need more information on Anaconda Packages, check the documentation.
pip doesn’t manage dependencies the same way conda does and can, potentially, damage one’s installation.
If you are using Ubuntu 18.04 or higher with python3.xxx then try this command
then try your command. hope it will work
I did the following:
I tried a lot of things, including uninstalling with the automated tools, but none of them worked. So, finally, I uninstalled scikit-learn manually.
And re-install using pip
Hope that can help someone else!
This happened to me, and I tried all the possible solutions with no luck!
Finally I realized that the problem was with the Jupyter notebook environment, not with sklearn!
I solved the problem by re-installing Jupyter in the same environment as sklearn.
The command is: conda install -c anaconda ipython . Done.
The other name of sklearn in anaconda is scikit-learn. simply open your anaconda navigator, go to the environments, select your environment, for example tensorflow or whatever you want to work with, search for scikit_learn in the list of uninstalled packages, apply it and then you can import sklearn in your jupyter.
The above did not help. Then I simply installed sklearn from within Jupyter-lab, even though sklearn 0.0 shows in ‘pip list’:
What I learned later is that pip installs, in my case, packages in a different folder than Jupyter. This can be seen by executing:
Once from within a Jupyter_lab notebook, and once from the command line using ‘py notebook.py’.
In my case the Jupyter list of paths were subfolders of ‘anaconda’, whereas the Python list were subfolders of c:\users\[username].
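The snippet this answer ran was not preserved here. A plausible stand-in, offered only as an assumption about what was executed, is to print the interpreter location and its module search path with the standard sys module; interpreter_report is a hypothetical helper name.

```python
import sys


def interpreter_report():
    """Collect where this interpreter lives and where it looks for packages."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Drop the empty entry that stands for the current directory.
        "search_path": [entry for entry in sys.path if entry],
    }


if __name__ == "__main__":
    report = interpreter_report()
    print("executable:", report["executable"])
    print("prefix:", report["prefix"])
    for entry in report["search_path"]:
        print("  ", entry)
```

Running it once in a notebook cell and once from the command line makes the difference between the two sets of folders visible side by side.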
On Windows, I had a Python 3+ version and pip version 22.3.1.
I had installed:
pip install sklearn
But it seems that this package name is deprecated in favor of scikit-learn:
pip install scikit-learn
Cause
Conda and pip install scikit-learn under
/anaconda3/envs/$ENV/lib/python3.7/site-packages, however Jupyter notebook looks for the package under
Therefore, even when the environment is specified to conda, it does not work.
Solution
pip 3 install the package under
Verify
After pip3, in a Jupyter notebook.
/anaconda3/lib/python3.7/site-packages/sklearn/__init__.py’
I had the same problem. The issue is that when we work with multiple Anaconda environments, not all packages are installed in all environments. You can check your conda environments by running the following command in the Anaconda prompt:
conda env list
Then you can check the packages installed in each environment:
conda list -n NAME_OF_THE_ENVIRONMENT
For me, the environment that I was working with was missing sklearn, although the package was installed in the other environments.
Therefore, I simply installed the sklearn package in that particular environment:
conda install -n NAME_OF_THE_ENVIRONMENT scikit-learn
and the issue was resolved.
Install these: pip install -U scikit-learn scipy matplotlib. If you are still getting the same error, make sure that your import statement is correct. I made a mistake while writing ensemble, so check the spelling; it should be >>> from sklearn.ensemble import RandomForestClassifier
I had the same issue as the author, and ran into the issue with and without Anaconda and regardless of Python version. Everyone’s environment is different, but after resolving it for myself I think that in some cases it may be due to having multiple versions of Python installed. Each installed Python version has its own Lib\site-packages folder, which can contain a unique set of modules for that Python version, and the IDE may be looking in a folder path that doesn’t have scikit-learn in it.
One way to try solve the issue: you might clear your system of all other Python versions and their cached/temp files/system variables, and then only have one version of Python installed anywhere. Then install the dependencies Numpy and Scipy, and finally Scikit-learn.
More detailed steps:
- Uninstall all Python versions and their launchers (e.g. from Control Panel in Windows) except the one version you want to keep. Delete any old Python version folders in the Python directory —uninstalling doesn’t remove all files.
- Remove other Python versions from your OS’ Environment Variables (both under the system and user variables sections)
- Clear temporary files. For example, for Windows, delete all AppData Temp cache files (in C:\Users\YourUserName\AppData\Local\Temp). In addition, you could also do a Windows disk cleanup for other temporary files, and then reboot.
- If your IDE supports it, create a new virtual environment in Settings, then set your only installed Python version as the interpreter.
- In your IDE, install the dependencies Scipy and Numpy from the module list first, then install Scikit-Learn.
As some others have suggested, the key is making sure your environment is set up correctly where everything points to the correct library folder on your computer where the Sklearn package is located. There are a few ways this can be resolved. My approach was more drastic, but it turns out that I had a very messy Python setup on my system so I had to start fresh.
How To Fix ModuleNotFoundError: No module named ‘sklearn’
Understanding how to properly install and import scikit-learn in Python
Introduction
People new to Python often have trouble installing the scikit-learn package, which is the de facto machine learning library. A very common error when importing the package in their source code is ModuleNotFoundError
This error indicates that the scikit-learn (aka sklearn ) package was not installed or, even if it was installed, that it cannot be resolved for some reason.
In today’s short tutorial, I’ll go through a few basic concepts about installing packages in Python that could help you get rid of this error and start working on your ML projects. More specifically, we will discuss
- the proper way for installing packages through pip
- how to upgrade scikit-learn to the latest version
- how to properly use virtual environments and manage package versions
- what to do if you are facing this issue with anaconda
- what to do if you are getting this error in a Jupyter notebook
Let’s get started!
Installing packages with pip the right way
In fact, you may have multiple Python versions installed on your local machine. Every time you install a package, this installation is associated with just a single version. Therefore, there’s a chance that you have installed scikit-learn for one Python version, but you are executing your source code using a different version and this may be the reason why scikit-learn cannot be found.
Therefore, make sure you use the correct command to install sklearn through pip . Many users attempt to install packages using the command
Both of the above commands are going to install the specified package for the Python version that pip is associated with. For instance, you can find out which one that is by running
Instead, make sure you use the following notation when installing Python packages through pip
This ensures that the package will be available to the Python version you will be using to run your source code. You can find where that specific Python executable is located on your local machine by executing
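The idea of tying pip to one specific interpreter can also be expressed from within Python. This is a minimal sketch, not the article's own code: it builds a pip command around sys.executable, so the package lands in the environment of the interpreter that runs the script; pip_install_command and install are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(package, upgrade=False):
    """Build a pip command bound to the interpreter running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(package)
    return command


def install(package, upgrade=False):
    """Run the command; requires pip and network access."""
    # check=True raises CalledProcessError if pip exits with an error.
    subprocess.run(pip_install_command(package, upgrade), check=True)


if __name__ == "__main__":
    print(" ".join(pip_install_command("scikit-learn", upgrade=True)))
```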
Upgrade the package to the latest version
Additionally, it may be helpful to ensure that you are using the latest version of scikit-learn and not a pretty old one. To update to the most recent version available you can run the following command:
Using Virtual Environments
Python’s venv module allows the creation of the so-called virtual environments. Every virtual environment is completely isolated and has its own Python binary. Additionally, it may also have its own set of installed Packages within its own site directory.
This means that if a package is installed within a specific virtual environment, it won’t be visible to the system-wide installed packages or any other virtual environment.
If you are not currently using a virtual environment, I would advise you to start doing so as it will greatly help you manage package dependencies easier and more efficiently.
Going to our specific use-case now, if you wish to work on a project that requires scikit-learn then you essentially have to follow three steps.
First, create a virtual environment for your project and place it into your desired location. Let’s create one using the name my_project_venv
Now that the virtual environment has been created, you should now activate it. You can do so using the following command:
If the virtual environment has been activated successfully you should be able to see the venv name as a prefix in your command line (e.g. (my_project_venv) ).
Now you can finally install sklearn (and any other dependency you need to build your Python application) using the commands we discussed earlier.
and eventually execute your script
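For readers who prefer to script these steps, the standard-library venv module can create the environment programmatically. This is a sketch under the assumption of a default layout (Scripts\python.exe on Windows, bin/python elsewhere); create_project_venv is a hypothetical helper name, and my_project_venv is the example name used above.

```python
import os
import venv


def create_project_venv(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


if __name__ == "__main__":
    python_path = create_project_venv("my_project_venv")
    print("Use this interpreter to install and run:", python_path)
```

Calling the returned interpreter with -m pip install scikit-learn installs into that environment even without activating it in the shell.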
What to do if you are using anaconda
If you are currently working with conda, you may have to be careful as to which environment you are actually working with.
If you want to install scikit-learn at the root (probably not recommended, given the importance of using isolated virtual environments for each project), then you could do so using
Alternatively, if you want to install the scikit-learn package to a specific anaconda environment, then you can use the -n flag to specify the environment name. For example, the following command will install scikit-learn to the conda environment called my_environment :
If none of the above approaches work, then you can still install scikit-learn through pip , even when working within a conda environment. In an anaconda prompt simply run
What to do if you are working with Jupyter
Finally, if you are getting the ModuleNotFoundError in a Jupyter notebook, then you need to ensure that both Jupyter and scikit-learn are installed within the same environment.
The first step is to check the path to which the jupyter notebook is installed on your local machine. For example,
As mentioned already, it is much better (and will definitely save you time and energy) if you work in isolated environments.
If you are working in a specific conda environment, then make sure to install both the Jupyter Notebook and the scikit-learn package within the same environment:
If you are working in a Python virtual environment (aka venv ) then:
and finally open your Jupyter notebook from the activated environment and import scikit-learn . You should now be good to go!
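A quick way to verify this from inside a notebook cell is to ask the kernel which interpreter it runs and whether it can locate the module. The sketch below uses only the standard library; module_location is a hypothetical helper name.

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("kernel interpreter:", sys.executable)
    print("sklearn found at:", module_location("sklearn"))
```

If the second line prints None, the kernel's interpreter is not the one scikit-learn was installed for.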
Still having troubles?
If you are still having trouble importing scikit-learn then there’s probably something else going wrong. You may find your answer in one of my articles
pip install sklearn behaviour #8215
Closed
lesteve opened this issue Jan 19, 2017 · 36 comments
Comments
I was surprised that pip install sklearn would do something without giving an error, which I learned from #8185 (comment). All in all it is not so bad because scikit-learn is in install_requires of sklearn setup.py so you do end up with scikit-learn installed. Still it is a bit confusing that pip list shows both scikit-learn 0.18.1 and sklearn 0.0.
@chrisfilo if I believe https://pypi.org/project/sklearn it looks like you uploaded it. I see a similar approach is done for bids and pybids https://pypi.org/project/bids. Is it a common thing to do to prevent package names from being squatted on PyPI?
There must be a way to have a slightly clever setup.py that allows upload but prevents install with an error saying something like "Please use pip install scikit-learn".
Maybe you could make sklearn an alias of scikit-learn.
I came across the following package.
I didn’t check the mechanism though.
Quickly looking at the pypi-alias code it may well be that pypi-alias was used to generate the sklearn PyPI package indeed.
Indeed you are right.
In the end, pip install sklearn or pip install scikit-learn will install the latest available build from PyPI, apart from the annoying sklearn (0.0) shown in the pip list output. I would have thought that this is the expected behaviour.
However, I could find a case which I find annoying. As a user, I would expect pip install sklearn==0.17 to do the same thing as pip install scikit-learn==0.17. This will throw an error since sklearn has only a (0.0) version.
Regarding the issue in #8185, I was checking the doc to see whether it mentions that you should use pip install . to trigger the install of the local version. I couldn’t find it. Did I miss it?
And I forgot that pip uninstall sklearn will not do what you expect either, if you expect it to remove scikit-learn.
Maybe it would be wise to phase out the sklearn package.
How about:
- first allowing installation through pip install sklearn but with a warning,
- then some time later don’t install and just show an error,
- and finally remove the package altogether.
My earlier comment is still valid:
There must be a way to have a slightly clever setup.py that allows upload but prevents install with an error saying something like «Please use pip install scikit-learn».
@hugovk if you can find or put together a setup.py that just raises an exception that would be a step forward on this issue.
Just to be clear I don’t think we should have a warning phase (since the problem is super easy to fix and to be honest I suspect very few people are doing pip install sklearn). We should not remove the package; it is there for a good reason: people might still do pip install sklearn and/or it could be squatted by not so benevolent people.
Here’s PyPI downloads stats (via pypinfo) for the last month suggesting over a quarter of scikit-learn installs are through sklearn!
package | download_count |
---|---|
sklearn | 283,395 |
scikit-learn | 1,082,341 |
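The two counts in the table can be expressed as a share in more than one way, which is worth keeping in mind when comparing percentages quoted in this thread. A small sketch using only the numbers from the table (the variable names are illustrative):

```python
# Download counts copied from the table above (last month, via pypinfo).
sklearn_downloads = 283_395
scikit_learn_downloads = 1_082_341

# sklearn downloads relative to scikit-learn's own download count.
ratio_to_scikit_learn = sklearn_downloads / scikit_learn_downloads

# sklearn downloads as a share of both packages' downloads combined.
share_of_combined = sklearn_downloads / (sklearn_downloads + scikit_learn_downloads)

print(f"sklearn relative to scikit-learn downloads: {ratio_to_scikit_learn:.1%}")
print(f"sklearn share of combined downloads: {share_of_combined:.1%}")
```

The first figure is a little over a quarter and the second is roughly one in five, so both readings are consistent with the same data.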
How about something like this in sklearn’s setup.py?
raise RuntimeError("sklearn is now known as scikit-learn. Please run: pip install scikit-learn")
Happy to transfer ownership of the sklearn pypi to someone more involved in the project.
My two cents: if it ain’t broken don’t fix it. I see little benefit from preventing people from using sklearn and a lot of broken CI setups and packages if you do implement this exception.
I also vaguely remember that we added this meta package because of some meta-programming pattern that was looking for packages by the name of the module (which is sklearn). This would also break; sadly I don’t remember the details.
@lesteve do you have an account? @chrisfilo you could also transfer to me. (t3kcit I think).
I would argue that keeping it as an alias is fine. And yes, I think a lot of CI scripts will break and it’ll be no benefit to anyone imho.
Here are a few ways this is broken:
pip install sklearn==0.18.1
Output:
Could not find a version that satisfies the requirement sklearn==0.18.1 (from versions: 0.0)
No matching distribution found for sklearn==0.18.1
pip install sklearn
pip uninstall -y sklearn
python -c "import sklearn" # Oh wait why does this work I thought I just uninstalled sklearn
pip list output is also misleading (why do I have two versions of scikit-learn and why is one 0.0, which was actually the question from #8185 (comment)):
scikit-learn (0.19.1)
sklearn (0.0)
If we believe the numbers from PyPI (1 in 5 scikit-learn pip installs is through pip install sklearn, really ???), maybe it’s worth having a warning for a while. Because the fix is so simple and could be very well explained in the error message I would slightly be in favour of just raising an exception.
My two cents: if it ain’t broken don’t fix it. I see little benefit from preventing people from using sklearn and a lot of broken CI setups and packages if you do implement this exception.
@chrisfilo I tend to be with you on this but in this case I am not, not sure exactly why. Probably part of the story is the mix of surprise and WTF I felt when I realised pip install sklearn was actually working. It’s a good point that sklearn in requirements.txt or setup.py could be an issue. Maybe it is worth using GitHub advanced search or the GitHub API to figure out how many people do that.
@lesteve do you have an account? @chrisfilo you could also transfer to me. (t3kcit I think).
I do have an account on PyPI which is lesteve.
Ownership transferred. Remember: With great power comes… ability to mess up a lot of packages
pip install -U scikit-learn==0.18
For future reference, as I couldn’t find a version under version control, I have made a copy of the current contents of https://pypi.org/project/sklearn/ under https://github.com/rth/scikit-learn-invalid-pypi
The consensus about this issue seems to be not to do anything, but if we decide to change something in that PyPi package in the future, please make a PR there first and ping relevant maintainers so the changes can be reviewed / commented on before uploading to PyPi.
For future reference, as I couldn’t find a version under version control, I have made a copy of the current contents of https://pypi.org/project/sklearn/ under https://github.com/rth/scikit-learn-invalid-pypi
Thanks! For completeness’ sake, last time I looked I got the feeling that they were done through a pypi alias tool, maybe https://pypi.org/project/pypi-alias?
The consensus about this issue seems to be not to do anything,
Although it makes me slightly sad, I think that this is not worth the pain indeed.
If someone has the time, it would be great to:
- look at PyPI downloads to see whether pip install sklearn behaviour #8215 (comment) is still true
- see whether a github search can help evaluating the number of packages using the sklearn package on PyPI, e.g. how many packages have sklearn in their setup.py
Other ideas:
- adding a warning in setup.py could be a middle ground solution and reevaluating in a few years. I doubt a warning would have significant impact though
- coupling the breaking change to a scikit-learn release so that we can advertise it and people are more aware of it.
Just for the story, I was on the wrong side on msgpack-python being renamed to msgpack on PyPI and this was clearly not my definition of fun (bonuses were: interaction between conda keeping the old name and pip, different behaviour whether you install a wheel or from source, see dask/distributed#1913 (comment) for the short version if you really care).
An issue that could cause significant confusion is that if a new version is released after you install using this alias, pip install --upgrade sklearn fails silently (the only requirement is some version of scikit-learn, not necessarily the most recent version). I suppose it would be possible to continually release new versions of sklearn requiring the most recent version of scikit-learn, but that would be absurd.
Another possibility without breaking people’s code could be to detect whether the sklearn package is installed (e.g. by looking for site-packages/sklearn-0.0.dist-info/ or with some more standard python way), and raise a warning at import time if that is the case. This may add some small overhead for the import though.
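As a rough illustration of the "more standard python way" mentioned above, the standard library's importlib.metadata can look up an installed distribution by name without touching site-packages paths directly. This is only a sketch of the idea, not what scikit-learn does; the function names and the warning text are made up for the example.

```python
import warnings
from importlib import metadata


def alias_distribution_installed(dist_name="sklearn"):
    """Return True if a distribution with this name is installed."""
    try:
        metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


def warn_if_alias_installed(dist_name="sklearn"):
    """Emit a warning when the alias distribution is present (hypothetical wording)."""
    if alias_distribution_installed(dist_name):
        warnings.warn(
            f"The '{dist_name}' PyPI package is only an alias; "
            "install 'scikit-learn' instead.",
            UserWarning,
        )
        return True
    return False


if __name__ == "__main__":
    print("sklearn alias installed:", alias_distribution_installed())
```

The lookup is by distribution name, so an environment that only has scikit-learn installed reports False for "sklearn" even though import sklearn works.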
for reference from airflow
import os
from setuptools import setup, find_packages

with open("README.md", "r") as fh:
    long_description = fh.read()

setup(
    name="airflow",
    version="0.6",
    author="Apache Author",
    author_email="dev@airflow.incubator.apache.org",
    description="Placeholder for the old Airflow package",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/apache/incubator-airflow",
    packages=find_packages(),
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Environment :: Console',
        'Environment :: Web Environment',
        'Intended Audience :: Developers',
        'Intended Audience :: System Administrators',
        'License :: OSI Approved :: Apache Software License',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Topic :: System :: Monitoring',
    ]
)

if not os.getenv("OVERRIDE"):
    raise RuntimeError('Please install package apache-airflow instead of airflow')
This issue came up with a member of a slack group I’m in. The brief context was they found an issue here that fixed their problem, which was needing to upgrade one of their development environments. Both they and I were confused about the behavior they were getting with variations of pip install --upgrade [mysterious options available here] such as sklearn or scikit-learn after finding the package description.
I don’t really have a pony in this race (I don’t use this package), but
- This issue continues to come up.
- The linked GitHub repository no longer exists (https://github.com/rth/scikit-learn-invalid-pypi).
- This discussion is hard to find if you are not familiar with python package development (what a setup.py is, how to obtain the setup.py for v0.0, find the right part of that code to search the internet for…).
You should break the world once and for all (edit: for clarity, raise an installation error on sklearn), and force people to install the correct package name. You should absolutely keep the package name and I’m sure PyPI would be ok with this, because malevolent code opportunities for this package are abundant.
Update: Alternatively, link to this discussion from the long_description on sklearn since people hit by this do find the PyPI page. And in either case close this issue 🙂
I guess maybe the 1.0 release could be an opportunity to do this cleanup + breaking change?
I am imagining something like to make the disruption a bit smoother:
- soon: uploading sklearn 0.1 on PyPI which raises an error explaining the change and allow an easy way out through an environment variable (better suggestions welcome for the easy way out)
- in combination with the scikit-learn 1.0 release (I guess in a few months?): upload sklearn 0.2 on PyPI which always raises an error (no easy way out). Optionally remove sklearn 0.1 from PyPI and maybe sklearn 0.0.
Alternatives considered but discarded: add a warning in setup.py, the warning will only be visible when using pip install -v (see https://stackoverflow.com/a/44617404) and will probably be ignored anyway (pip install -v is very verbose) …
One possible thing to look at is sklearn dependents from https://libraries.io/pypi/sklearn. One caveat is that there are plenty of false positives which seems like historical dependents of sklearn. For example pycaret is listed probably because sklearn was amongst the dependencies at one point (2 years ago):
https://github.com/pycaret/pycaret/blob/b9756afeae6b24107d86c74544deb83fe5527b50/requirements.txt#L7
but this has been fixed since.
For completeness, I checked the numbers on pypistats.org: roughly 29% of the scikit-learn installs come from the sklearn package … which is in the same ballpark as #8215 (comment). I find this both amazing and slightly sad at the same time 😉.
sklearn
https://pypistats.org/packages/sklearn
Downloads last month: 6,528,058
scikit-learn
https://pypistats.org/packages/scikit-learn
Downloads last month: 22,800,647
It would be good to fix this. I think if we error we always need a way out, because some popular but no longer maintained libraries can have sklearn in their list of dependencies and there is not much one can do about it.
Maybe we could make 1 sklearn release,
- that changes the description in https://pypi.org/project/sklearn/ to be much more verbose and explicitly say that bad things will happen if you use it.
- Always keep the env variable to disable the error
- say starting from September 1st start producing an error each Monday between 3PM CEST to 7PM CEST . This should raise enough attention so that issues are opened in popular libraries and this is fixed.
- Then progressively increase the frequency of failure, from once per week to every other day, every day etc, with a sufficiently long period. All this can be defined initially in that 0.1 sklearn release.
This would give time to libraries that have sklearn in their dependencies to update and make a new release. I forgot what the name for such voluntary intermittent failures is.
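To make the proposed schedule concrete, here is a minimal sketch of how such a check could be written. It is not the code of any released sklearn package: the environment variable name, the start year, and the phase lengths are all assumptions chosen for illustration, and CEST is approximated by a fixed UTC+2 offset.

```python
import os
from datetime import datetime, timedelta, timezone

# Assumptions for illustration only: the variable name, the start year and
# the phase lengths are hypothetical and not taken from the thread.
OVERRIDE_ENV_VAR = "ALLOW_SKLEARN_ALIAS_INSTALL"
CEST = timezone(timedelta(hours=2))  # fixed UTC+2 offset, ignores DST changes
START = datetime(2021, 9, 1, tzinfo=CEST)
WEEKLY_PHASE_DAYS = 56       # Mondays only during the first 8 weeks
ALTERNATE_PHASE_DAYS = 112   # then every other day until day 112, daily after


def should_fail(now, start=START, environ=os.environ):
    """Return True if an install attempted at `now` should raise an error."""
    if environ.get(OVERRIDE_ENV_VAR):
        return False  # the escape hatch always wins
    local = now.astimezone(CEST)
    if local < start:
        return False  # the brownout has not started yet
    if not 15 <= local.hour < 19:
        return False  # outside the 3PM to 7PM window
    days_elapsed = (local.date() - start.astimezone(CEST).date()).days
    if days_elapsed < WEEKLY_PHASE_DAYS:
        return local.weekday() == 0  # Mondays only
    if days_elapsed < ALTERNATE_PHASE_DAYS:
        return days_elapsed % 2 == 0  # every other day
    return True  # every day


# Example with a fixed timestamp so the result does not depend on the clock.
first_monday = datetime(2021, 9, 6, 16, 0, tzinfo=CEST)
print(should_fail(first_monday, environ={}))
```

Keeping the whole schedule in one pure function of the current time means it can be defined once in a single release, as suggested above, and tested without waiting for the calendar.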
I think it would be better to do this completely independently from any scikit-learn release, to avoid implying this is somehow a breaking change / regression, though we can mention it in release notes.
Yes, thanks. For some reason my brain only suggested hashbrown when I tried to find the term
I think if we error we always need a way out, because some popular but no longer maintained libraries can have sklearn in their list of dependencies and there is not much one can do about it.
Good point, this also applies to old package versions that at one point required sklearn and unfortunately we can not fix the past 😉.
I think it would be better to do this completely independently from any scikit-learn release, to avoid implying this is somehow a breaking change / regression, though we can mention it in release notes.
OK I am fine with this.
About the brownout, why not … I’d rather have something simpler if possible. It feels slightly weird to have something failing and then working the next day (I do get the increasing failure frequency thing but still …). IMO one of the reason it works for github is that they can detect you are still not using the recommended way and they can send you emails from time to time to nudge you into the right direction (e.g. creating a token to use git on the command line if you are using https).
So maybe a fixed schedule with a few different environment variables like IGNORE_SKLEARN_PACKAGE_WARNING, IGNORE_SKLEARN_PACKAGE_SECOND_WARNING, IGNORE_SKLEARN_PACKAGE_LAST_WARNING … one small advantage is that then you can easily google who is actually using these environment variables (I am slightly annoyed I was not able to find a good way to know who is the root cause of all of these sklearn package PyPI downloads, it must be the CI of a popular package and of its dependents …).
IMO one of the reason it works for github is that they can detect you are still not using the recommended way and they can send you emails from time to time to nudge you
Brownout is the nudge. Other per-user notifications are not expected. PyPI did a similar thing for deprecating non SNI compatible clients pypi/support#978. It is quite suitable for situations where you want some change to happen, make sure users are aware but don’t want to make it a hard blocker either. It also allows one to monitor the migration as it happens (in our case PyPI stats) and adjust the schedule if necessary.
If we just fail permanently, it becomes a blocker for all maintainers who happened to depend on sklearn. They would be expected to drop everything and start working on their CI, updating documentation, responding to user complaints etc. Which I find is not very nice, particularly given that there is no urgency to fix this. In our case, we don’t really care if this migration happens even say by mid 2022 as long as it does happen.
So maybe a fixed schedule with a few different environment variables like IGNORE_SKLEARN_PACKAGE_WARNING, IGNORE_SKLEARN_PACKAGE_SECOND_WARNING, IGNORE_SKLEARN_PACKAGE_LAST_WARNING
That would be a bit confusing for downstream packages to document IMHO. I would rather be +1 for a single env variable.
I think we can search for packages that currently use sklearn as a dependency with BigQuery, assuming one can filter files to requirements*.txt and setup.py only and search inside the file contents.
+1 for the brownout strategy. We can setup a wiki page or an issue that documents the planned brownout schedule and announce it on twitter + the mailing list.
We should discuss this on the next online dev meeting.
sklearn had 7,232,064 downloads over 30 days.
I’ve created PRs for four packages which had a total 7,039,690 downloads over 30 days:
- 6,714,326 downloads: Replace sklearn with scikit-learn jsonpickle/jsonpickle#360
- 173,025 downloads: Replace sklearn with scikit-learn bmabey/pyLDAvis#212
- 87,089 downloads: Replace sklearn with scikit-learn databand-ai/dbnd#51
- 65,250 downloads: Replace sklearn with scikit-learn SauceCat/PDPbox#77
These don’t necessarily account for 7,039,690 sklearn downloads, but it’ll help!
Very nice indeed @hugovk! Out of curiosity (and in case we need to do it again in the future), how did you find the sklearn dependents with the most PyPI downloads?
I knocked up a script to find which sdist and wheel packages in https://github.com/DavHau/pypi-deps-db have sklearn as a dependency, and filtered out only those in the top 4k most-downloaded: https://github.com/hugovk/top-pypi-packages:
jsonpickle {'jsonpickle-1.5.2-py2.py3-none-any.whl'}
xgboost {'xgboost-1.0.0rc1-py2.py3-none-manylinux1_x86_64.whl'}
shap {'0.5'}
hmmlearn {'hmmlearn-0.2.0-cp34-cp34m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl', 'hmmlearn-0.2.0-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl', 'hmmlearn-0.2.0-cp33-cp33m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl'}
moviepy {'moviepy-0.2.3.2-py2.py3-none-any.whl'}
fastai {'0.7.0'}
pyldavis {'3.3.0'}
dbnd {'dbnd-0.42.1-py2.py3-none-any.whl'}
pdpbox {'0.2.1'}
datarobot {'datarobot-2.22.0-py2-none-any.whl', 'datarobot-2.25.0-py3-none-any.whl'}
caer {'1.3.6'}
flair {'0.4.1'}
recordlinkage {'0.2'}
Some of these are older releases, and they’ve since replaced sklearn. I made PRs for the rest, except for https://pypi.org/project/datarobot/ (108k downloads in 30 days) which might be closed source (or I couldn’t find the repo). Could be worth emailing them.
Here’s also a list of all matching packages: sklearn-all.txt
(To run the script, clone the repo and check the setup info in the top-level docstring. Likely Python 3.9+ but easy to adjust for 3.6+. If anyone would like a hand running the script, feel free to open an issue in its repo!)
except for pypi.org/project/datarobot (108k downloads in 30 days) which might be closed source (or I couldn’t find the repo). Could be worth emailing them.
Email sent. It’s only used in their extra examples, so most of their users probably wouldn’t download it.
Speaking as one of the maintainers, we just released datarobot v2.26.0 which is no longer dependent on sklearn. Same with our related datarobot-early-access package in v2.26.0.2021.9.13. Thanks for the heads up.
For reference, there is an anti-typo-squatting package for NLTK that performs the "fail with helpful error" strategy: ntlk
I have created a repo to implement the sklearn package deprecation brownout:
https://github.com/scikit-learn/sklearn-pypi-package
Feedback on the repo is more than welcome. The start date is currently set to November 1st with a brownout period of one year but this is open to discussion.
The error “ModuleNotFoundError: No module named sklearn” is a common error experienced by data scientists when developing in Python. The error is likely an environment issue whereby the scikit-learn package has not been installed correctly on your machine. Thankfully, there are a few simple steps to go through to troubleshoot the problem and find a solution.
Your error, whether in a Jupyter Notebook or in the terminal, probably looks like one of the following:
No module named 'sklearn'
ModuleNotFoundError: No module named 'sklearn'
In order to find the root cause of the problem we will go through the following potential fixes:
- Upgrade pip version
- Upgrade or install scikit-learn package
- Check if you are activating the environment before running
- Create a fresh environment
- Upgrade or install Jupyter Notebook package
Are you installing packages using Conda or Pip package manager?
It is common for developers to use either Pip or Conda for their Python package management. It’s important to know what you are using before we continue with the fix.
If you have not explicitly installed and activated Conda, then you are almost definitely going to be using Pip. One sanity check is to run conda info in your terminal; if it returns anything, you are likely using Conda.
Upgrade or install pip for Python
First things first, let’s make sure we have an up-to-date version of pip installed. We can do this by running:
pip install --upgrade pip
Upgrade or install scikit-learn package via Conda or Pip
The most common reason for this error is that the scikit-learn package is not installed in your environment or an outdated version is installed. So let’s update the package or install it if it’s missing.
For Conda:
# To install in the root environment
conda install -c anaconda scikit-learn
# To install in a specific environment
conda install -n MY_ENV scikit-learn
For Pip:
# To install in the root environment
python3 -m pip install -U scikit-learn
# To install in a specific environment
source MY_ENV/bin/activate
python3 -m pip install -U scikit-learn
Activate Conda or venv Python environment
It is highly recommended that you use isolated environments when developing in Python. Because of this, one common mistake developers make is that they don’t activate the correct environment before they run the Python script or Jupyter Notebook. So, let’s make sure you have your correct environment running.
For Conda:
conda activate MY_ENV
For virtual environments:
source MY_ENV/bin/activate
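To confirm from inside Python which environment is actually active, a short check along these lines can help. It is a sketch using only the standard library; the function name and dictionary keys are illustrative. Note that the sys.prefix comparison detects venv-style environments, while Conda environments are reported through the CONDA_DEFAULT_ENV variable that conda activate sets.

```python
import importlib.util
import os
import sys


def describe_environment(module_name="sklearn"):
    """Report which interpreter is running and whether it can see the module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        # True inside an environment created with `python -m venv`.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Name of the active Conda environment, or None if not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running the same snippet in the terminal and in a Jupyter Notebook cell is a quick way to see whether the two are using different interpreters.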
Create a new Conda or venv Python environment with scikit-learn installed
During the development process, a developer will likely install and update many different packages in their Python environment, which can over time cause conflicts and errors.
Therefore, one way to solve the module error for sklearn is to simply create a new environment with only the packages that you require, removing all of the bloatware that has built up over time. This will provide you with a fresh start and should get rid of problems that installing other packages may have caused.
For Conda:
# Create the new environment with the desired packages
conda create -n MY_ENV python=3.9 scikit-learn
# Activate the new environment
conda activate MY_ENV
# Check to see if the packages you require are installed
conda list
For virtual environments:
# Navigate to your project directory
cd MY_PROJECT
# Create the new environment in this directory
python3 -m venv MY_ENV
# Activate the environment
source MY_ENV/bin/activate
# Install scikit-learn
python3 -m pip install scikit-learn
Upgrade Jupyter Notebook package in Conda or Pip
If you are working within a Jupyter Notebook and none of the above has worked for you, then it could be that your installation of Jupyter Notebooks is faulty in some way, so a reinstallation may be in order.
For Conda:
conda update jupyter
For Pip:
pip install -U jupyter
Best practices for managing Python packages and environments
Managing packages and environments in Python is notoriously problematic, but there are some best practices which should help you to avoid the majority of package problems in the future:
- Always use separate environments for your projects and avoid installing packages to your root environment
- Only install the packages you need for your project
- Pin your package versions in your project’s requirements file
- Make sure your package manager is kept up to date
A common error you may encounter when using Python is modulenotfounderror: no module named ‘sklearn’. This error occurs when Python cannot detect the Scikit-learn library in your current environment, and Scikit-learn does not come with the default Python installation. This tutorial goes through the exact steps to troubleshoot this error for the Windows, Mac and Linux operating systems.
Table of contents
- ModuleNotFoundError: no module named ‘sklearn’
- What is ModuleNotFoundError?
- What is Scikit-learn?
- How to install Scikit-learn on Windows Operating System
- How to install Scikit-learn on Mac Operating System
- How to install Scikit-learn on Linux Operating System
- Installing pip for Ubuntu, Debian, and Linux Mint
- Installing pip for CentOS 8 (and newer), Fedora, and Red Hat
- Installing pip for CentOS 6 and 7, and older versions of Red Hat
- Installing pip for Arch Linux and Manjaro
- Installing pip for OpenSUSE
- Check Scikit-Learn Version
- Installing Scikit-Learn Using Anaconda
- Prerequisites Before Using Scikit-Learn
- Summary
ModuleNotFoundError: no module named ‘sklearn’
What is ModuleNotFoundError?
The ModuleNotFoundError occurs when the module you want to use is not present in your Python environment. There are several causes of the modulenotfounderror:
The module’s name is incorrect, in which case you have to check the name of the module you tried to import. Let’s try to import the re module with a double e to see what happens:
import ree
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
1 import ree
ModuleNotFoundError: No module named 'ree'
To solve this error, ensure the module name is correct. Let’s look at the revised code:
import re
print(re.__version__)
2.2.1
You may want to import a local module file, but the module is not in the same directory. Let’s look at an example package with a script and a local module to import. Let’s look at the following steps to perform from your terminal:
mkdir example_package
cd example_package
mkdir folder_1
cd folder_1
vi module.py
Note that we use Vim to create the module.py file in this example. You can use your preferred file editor, such as Emacs or Atom. In module.py, we will import the re module and define a simple function that prints the re version:
import re

def print_re_version():
    print(re.__version__)
Close the module.py, then complete the following commands from your terminal:
cd ../
vi script.py
Inside script.py, we will try to import the module we created.
import module
if __name__ == '__main__':
    module.print_re_version()
Let’s run python script.py from the terminal to see what happens:
ModuleNotFoundError: No module named 'module'
To solve this error, we need to point to the correct path to module.py, which is inside folder_1. Let’s look at the revised code:
import folder_1.module as mod
if __name__ == '__main__':
    mod.print_re_version()
When we run python script.py, we will get the following result:
2.2.1
Lastly, you can encounter the modulenotfounderror when you import a module that is not installed in your Python environment.
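For this last case, the name you import is not always the name you install, which is exactly the situation with sklearn and scikit-learn. The sketch below catches the error and suggests an install command; the helper names and the small mapping are illustrative rather than part of any library.

```python
import importlib

# Import name -> name to pass to pip. Only two well-known mismatches are
# listed here; extend the mapping for other packages as needed.
PIP_NAMES = {"sklearn": "scikit-learn", "cv2": "opencv-python"}


def pip_name_for(module_name):
    """Return the pip package name for a top-level import name."""
    return PIP_NAMES.get(module_name, module_name)


def import_or_hint(module_name):
    """Try to import a module; return (module, None) or (None, hint)."""
    try:
        return importlib.import_module(module_name), None
    except ModuleNotFoundError as exc:
        # exc.name holds the module that could not be found.
        missing = exc.name or module_name
        hint = f"{exc}. Try: python -m pip install {pip_name_for(missing)}"
        return None, hint
```

Calling import_or_hint("sklearn") in an environment without scikit-learn returns a hint that points at the scikit-learn package rather than at sklearn.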
What is Scikit-learn?
Scikit-learn is a Python module for machine learning. The library is mainly written in Python and is built on NumPy, SciPy, and Matplotlib. The simplest way to install Scikit-learn is to use the package manager for Python called pip. The following instructions to install Scikit-learn are for the major Python version 3.
How to install Scikit-learn on Windows Operating System
You need to download and install Python on your PC. Ensure you select the install launcher for all users and Add Python to PATH checkboxes. The latter ensures the interpreter is in the execution path. Pip is installed automatically on Windows with Python versions 2.7.9+ and 3.4+.
You can install pip on Windows by downloading the installation package, opening the command line and launching the installer. You can install pip via the CMD prompt by running the following command.
python get-pip.py
You may need to run the command prompt as administrator. Check whether the installation has been successful by typing.
pip --version
To install Scikit-learn with pip, run the following command from the command prompt.
pip install -U scikit-learn
How to install Scikit-learn on Mac Operating System
Open a terminal by pressing command (⌘) + Space Bar to open the Spotlight search. Type in terminal and press enter. To get pip, first ensure you have installed Python3.
You can install Python3 by using the Homebrew package manager:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
export PATH="/usr/local/opt/python/libexec/bin:$PATH"
# if you are on macOS 10.12 (Sierra) use `export PATH="/usr/local/bin:/usr/local/sbin:$PATH"`
brew update
brew install python # Python 3
python3 --version
Python 3.8.8
Download pip by running the following curl command:
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
The curl command allows you to specify a direct download link, and using the -o option sets the name of the downloaded file.
Install pip by running:
python3 get-pip.py
From the terminal, use pip3 to install Scikit-learn:
pip3 install -U scikit-learn
How to install Scikit-learn on Linux Operating System
All major Linux distributions have Python installed by default. However, you will need to install pip. You can install pip from the terminal, but the installation instructions depend on the Linux distribution you are using. You will need root privileges to install pip. Open a terminal and use the commands relevant to your Linux distribution to install pip.
Installing pip for Ubuntu, Debian, and Linux Mint
sudo apt install python3-pip
Installing pip for CentOS 8 (and newer), Fedora, and Red Hat
sudo dnf install python3-pip
Installing pip for CentOS 6 and 7, and older versions of Red Hat
sudo yum install epel-release
sudo yum install python3-pip
Installing pip for Arch Linux and Manjaro
sudo pacman -S python-pip
Installing pip for OpenSUSE
sudo zypper install python3-pip
Once you have installed pip, you can install Scikit-learn using:
pip install -U scikit-learn
Check Scikit-Learn Version
Once you have successfully installed Scikit-learn, you can use two methods to check the version of Scikit-learn. First, you can use pip show from your terminal.
pip show scikit-learn
Name: scikit-learn
Version: 0.24.1
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author: None
Author-email: None
License: new BSD
Location: /Users/Yusufu.Shehu/opt/anaconda3/lib/python3.8/site-packages
Requires: threadpoolctl, numpy, scipy, joblib
Required-by: mlxtend, imbalanced-learn
Second, within your python program, you can import Scikit-Learn and then reference the __version__ attribute:
import sklearn
print(sklearn.__version__)
0.24.1
Installing Scikit-Learn Using Anaconda
Anaconda is a distribution of Python and R for scientific computing and data science. You can install Anaconda by going to the installation instructions. Once you have installed Anaconda, you can install Scikit-learn using the following command:
conda install -c conda-forge scikit-learn
Prerequisites Before Using Scikit-Learn
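Before you can start using scikit-learn, you must have its dependencies installed. The minimum versions below date from when this tutorial was written; newer scikit-learn releases require newer versions: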
- Python (>= 3.5)
- NumPy (>= 1.11.0)
- SciPy (>= 0.17.0)
- Joblib (>= 0.11)
- Matplotlib (>= 1.5.1) required for Scikit-Learn plotting capabilities
- Pandas (>= 0.18.0), needed only for some Scikit-learn examples
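To see which of these packages are present in the current environment, and in which versions, the standard library's importlib.metadata can be queried without importing the packages themselves. This is a minimal sketch; the function name and the list of distribution names are illustrative.

```python
from importlib import metadata

# Distribution names as published on PyPI; adjust the tuple to your needs.
DISTRIBUTIONS = ("scikit-learn", "numpy", "scipy", "joblib", "matplotlib", "pandas")


def installed_versions(names=DISTRIBUTIONS):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions against the minimums documented for the scikit-learn release you install shows quickly whether anything needs upgrading.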
Summary
Congratulations on reading to the end of this tutorial. The modulenotfounderror occurs if you misspell the module name, incorrectly point to the module path or do not have the module installed in your Python environment. If you do not have the module installed in your Python environment, you can use pip to install the package. However, you must ensure you have pip installed on your system. You can also install Anaconda on your system and use the conda install command to install the Scikit-learn library.
You may encounter a ModuleNotFoundError when trying to use a class or function from a module in Scikit-Learn. To solve this error, go to the article: How to Solve ModuleNotFoundError: No module named ‘sklearn.cross_validation’.
- Causes of ImportError: No module named sklearn in Python
- Fix ImportError: No module named sklearn in Python
- Installation of sklearn Module Using PIP in Python
- Installation of sklearn Module Using Conda
- Import sklearn and Check Its Version in Python
- Conclusion
In Python, sklearn is a machine learning library used for tasks such as regression and clustering. Importing it often fails with the error: No module named sklearn.
This means the system cannot find it due to a bad installation, an invalid Python or pip version, or other problems.
Causes of ImportError: No module named sklearn in Python
Suppose we install sklearn or any other Python library into the system, and the system reports that the library was installed successfully.
But when we import the library, we see the error: No module named 'sklearn'.
This mainly happens for two reasons.
- The library did not get installed successfully.
- It was installed into a directory that the Python interpreter running the code does not search, for example one belonging to a different Python installation.
This article explains how to install sklearn into the system correctly.
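Before reinstalling anything, it can help to check which interpreter is running and whether that interpreter can locate sklearn at all. The following is a small sketch using only the standard library; the helper name locate_module is hypothetical.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file the current interpreter would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # The interpreter path shows which Python installation is actually in use.
    print("Interpreter:", sys.executable)
    print("sklearn found at:", locate_module("sklearn"))
```

If the second line prints None, the library is not visible to this interpreter, even if pip reported a successful installation elsewhere.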
Fix ImportError: No module named sklearn in Python
Installing sklearn correctly on a Windows system is one way to solve the ImportError: No module named sklearn error in Python. There are two prerequisites for it.
- Python
- pip
It helps to know the pip and Python versions beforehand. A bad installation happens when the pip on the system belongs to a different Python version than the one used to run the code.
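One way to avoid a mismatch between pip and Python is to invoke pip through the interpreter that will run the code. The sketch below only builds that command; the helper name pip_install_command is hypothetical.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    # "python -m pip" guarantees that pip installs into this interpreter's
    # site-packages rather than into some other Python on the system.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print(" ".join(pip_install_command("scikit-learn")))
```

The returned list can be passed to subprocess.run to perform the installation, or the printed command can be copied into a terminal.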
There are three different ways to install Python on Windows; choose whichever seems suitable.
- Installing from an executable (.exe) installer
- Getting it from the Microsoft Store
- Getting a Linux distribution through the Windows Subsystem for Linux
This section covers only the first two options, which are the most common installation techniques on Windows. The two official Python installers for Windows differ from each other, and the Microsoft Store package has some significant restrictions.
Installation From the Microsoft Store
The Microsoft Store package is the best option for a hassle-free setup. It involves two main steps.
- Search for Python
Search for Python in the Microsoft Store application. Several installable versions will likely appear.
To access the installation screen, choose Python 3.8 or the highest version number listed.
Another way is to launch PowerShell and enter the following command.
Pressing Enter will launch the Microsoft Store and direct the browser to the most recent version of Python available if it is not installed on the computer.
- Install the Python app.
Once the version is chosen, the steps to finish the installation are as follows.
  - Select Get.
  - Await the download of the application. The Install on my devices button will appear in place of the Get button once the download is complete.
  - Select the devices on which you want to install Python, then click Install on my devices.
  - To begin the installation, click Install Now and then OK.
  - If the installation was successful, the Microsoft Store page will display "This product is installed" at the top.
Installation Using an Executable Installer
- Download the Full Installer
To download the full installer, follow these steps.
- Open a browser window and go to the Python.org Downloads page for Windows.
- Click the link for the Most Recent Python 3 Release (Python 3.x.x) under the “Python Releases for Windows” category. Python 3.10.5 was the most recent version as of this writing.
- Scroll to the bottom and choose either Windows x86-64 executable installer for 64-bit or Windows x86 executable installer for 32-bit.
Go to the next step after the installer has finished downloading.
- Run the installer
After selecting and downloading an installer, double-click the downloaded file to launch it. A dialog box will appear.
There are four things to notice about this dialog box.
- The default install path is in the AppData/ directory of the current Windows user.
- The Customize installation button can be used to change the install location and to select extra features such as pip and IDLE.
- The Install launcher for all users (recommended) checkbox is selected by default. This means that the py.exe launcher will be accessible to all users on the system. To make the installation available just for the current user, uncheck the option.
- The Add Python 3.10 to PATH checkbox is not selected. Make sure you understand the ramifications before checking this box, because there are several reasons why you might not want Python on PATH.
The full installer gives you complete control over the installation process. Following these steps will install Python on the system.
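Whether or not Python was added to PATH determines which executables a terminal finds when you type python or pip. As a rough sketch, the standard library can report what each command name resolves to; the helper name resolve_on_path and the list of command names are assumptions chosen for illustration.

```python
import shutil


def resolve_on_path(commands=("python", "python3", "py", "pip", "pip3")):
    """Map each command name to the executable found on PATH, or None."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    for name, location in resolve_on_path().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

If python and pip resolve to different installation directories, packages installed with that pip will not be visible to that python.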
Now the installation of sklearn can be initiated to solve the No module named 'sklearn' error.
Installation of sklearn Module Using PIP in Python
Launch a command-line tool such as PowerShell or the Command Prompt to install sklearn this way.
To start PowerShell:
- Open the Start menu by pressing the Windows key.
- Type PowerShell (or Command Prompt/cmd, if PowerShell is not installed).
- Click on it or press the Enter key.
Alternatively, PowerShell can be opened by right-clicking and selecting Open here as an administrator.
Inside the command-line tool, the first step is to install pip.
pip is a Python package manager that downloads and installs Python libraries on the system. To install pip, type the command:
Once Python and pip are set up, sklearn can be installed. Type the command:
pip3 install scikit-learn
This will install the scikit-learn package from the Python Package Index (PyPI), the repository that pip downloads from.
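If the system already has an older version of Python, or sklearn is required for a previous version like Python 2, use the command below. Note that scikit-learn 0.20 was the last release to support Python 2.7; newer releases require Python 3.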
pip2 install scikit-learn
If all the steps are executed correctly, the No module named 'sklearn' error should be resolved.
Once it is installed successfully, we can check its version by using:
When the system can detect sklearn and report its version, the error No module named 'sklearn' stands resolved, and the module can be imported.
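The same check can also be made from Python. Note that the package is installed under the name scikit-learn but imported as sklearn. The sketch below reports both views, assuming Python 3.8 or newer; the helper name sklearn_status is hypothetical.

```python
import importlib.util
from importlib import metadata


def sklearn_status():
    """Report the installed scikit-learn version and whether sklearn is importable."""
    try:
        # pip and conda register the distribution as "scikit-learn".
        version = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        version = None
    # The import name is "sklearn"; find_spec checks for it without importing it.
    importable = importlib.util.find_spec("sklearn") is not None
    return {"version": version, "importable": importable}


if __name__ == "__main__":
    print(sklearn_status())
```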
Installation of sklearn Module Using Conda
Anaconda is a Python distribution that runs Python inside isolated environments. It can also install and manage Python packages using conda, an alternative to pip.
This is an alternative method to install sklearn into a targeted environment and resolve the No module named 'sklearn' issue.
To install sklearn in Anaconda, open Anaconda3 in administrative mode. After Anaconda is launched, open a command-line prompt such as cmd or PowerShell from inside Anaconda.
A command-line tool opened this way runs inside an Anaconda environment, so the packages installed from it are available only within Anaconda.
Installing sklearn involves the following steps:
- Checking conda’s version
- Updating conda
- Installing sklearn
- Creating and activating a sklearn virtual environment
Inside the prompt, write the command:
conda -V
This will display the current version of conda. Then it needs to be updated using the command:
conda update conda
After displaying the list of packages that need to be installed, it asks for a y/n confirmation. To proceed, type y and press Enter.
conda will then update itself along with all the necessary packages.
Now, to install sklearn, use the command:
conda create -n sklearn-env -c conda-forge scikit-learn
The above command creates a virtual environment named sklearn-env and installs scikit-learn and all of its dependencies within that environment.
In Python, a virtual environment is simply an isolated directory that stores specific libraries and a version of Python so that it does not interfere with other libraries and other Python versions.
Entering the above command will display a list of packages that must be downloaded and installed along with sklearn. It asks for a y/n confirmation, and entering y will install all the displayed packages.
After this completes, scikit-learn is downloaded and installed into the new Anaconda environment.
Note: A conda HTTP error might occur while installing one of the above packages if the download from the server times out. This happens most often while Python itself is being downloaded.
The prompt displays the file path of the bad installation; go to the file path, delete the directory with the faulty download, and retry the above command.
The installation should then complete successfully.
Although sklearn is installed, it cannot be used directly yet, because it was installed inside a virtual environment that needs to be activated first.
Type the command:
conda activate sklearn-env
This will activate the sklearn virtual environment, and now sklearn can be imported and used. If all the steps are correctly followed, the No module named 'sklearn' error should be resolved by now.
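To confirm from within Python that the expected environment is active, a sketch like the following can be used. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets on activation, and on sys.prefix; the helper name environment_info is hypothetical.

```python
import os
import sys


def environment_info():
    """Describe the environment the current interpreter is running in."""
    return {
        # Root directory of the active Python installation or environment.
        "prefix": sys.prefix,
        # True inside an environment created with "python -m venv".
        "in_venv": sys.prefix != sys.base_prefix,
        # Name of the active conda environment, or None if conda is not active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in environment_info().items():
        print(f"{key}: {value}")
```

After running conda activate sklearn-env, the conda_env entry would be expected to read sklearn-env.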
In the next section, we will check sklearn’s version and import it.
Import sklearn and Check Its Version in Python
To check the version of sklearn, type the command:
conda list scikit-learn
To check all the installed packages, use the command:
conda list
To check the version of sklearn along with its dependencies, type:
python -c "import sklearn; sklearn.show_versions()"
Now that we know sklearn is installed, we can import it with:
import sklearn
If it shows no errors, sklearn is working correctly.
If you get the error No module named 'sklearn' again, remove the virtual environment and its packages, restart your system, and repeat the above steps to install it correctly.
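Beyond a bare import, a short smoke test can confirm that the library actually runs. The sketch below fits a linear regression on three points lying on the line y = x and checks the prediction for x = 3; the function name smoke_test is hypothetical, and it returns None when sklearn cannot be imported.

```python
def smoke_test():
    """Return True if sklearn imports and fits a trivial model, None if it is missing."""
    try:
        from sklearn.linear_model import LinearRegression
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here too.
        return None
    # Three points on the line y = x, so the fitted model should predict about 3 for x = 3.
    model = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])
    prediction = model.predict([[3.0]])[0]
    return bool(abs(prediction - 3.0) < 1e-6)


if __name__ == "__main__":
    result = smoke_test()
    print("sklearn is not installed" if result is None else f"smoke test passed: {result}")
```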
Conclusion
This article explains various methods to install sklearn into the system and resolve the No module named 'sklearn' error in Python. After reading it, you should be able to install sklearn in different kinds of systems and setups.
It is hoped that this article helped in your learning journey.