
Thursday, January 30, 2014

Installing the NLTK: Tips and tricks


Though the NLTK is a very useful tool aimed at students, its installation can sometimes be a bit tricky. Some people recommend a pre-bundled version such as Enthought, but I have not had the chance to use it yet. These tips might help if you are having trouble:
  • If installing NumPy, NLTK or YAML on a 64-bit version of Windows fails with an error saying that Python has not been found, check that you have installed a 32-bit (x86) version of Python. Also bear in mind that Python 2.x and Python 3.x need different downloads. If you are going to use Python 2.x, I recommend Python 2.7.
  • I highly recommend that you download and install setuptools and pip. Setuptools includes easy_install, which will make your life much easier. Both tools can help when the binaries are defective or outdated, because they simplify the work of installing from source. To install setuptools, follow the instructions on this link. Pip installation instructions can be found here. Basically, you just download the .py file and run it.
Warning: To be able to run it, make sure that you have added C:\Python27 (or whatever your Python installation directory is) and C:\Python27\Scripts to your PATH variable. Instructions on how to do this (and how to fix your PATH variable if it is broken) can be found in this post.
  • You won't notice any big changes, but you can check that the installation worked by trying to install a package such as matplotlib. You need to use the Windows command prompt for this (Start menu > Run > cmd.exe). Just type this at the prompt and hit Return:
pip install matplotlib
or, if you prefer to use easy_install:
easy_install matplotlib
You will often see it written as:
sudo easy_install matplotlib
sudo pip install matplotlib
The 'sudo' prefix is meant for Mac and Linux systems, not Windows. To check that everything went well, open IDLE and try to import the package you just installed, e.g.:
 >>> import matplotlib 
If Python doesn't complain about a package not being found and simply shows you another line of the prompt, the installation went well. 
  • I found that matplotlib, which is used throughout the book to plot graphs, caused problems when installed from the binaries (it installed correctly, but NLTK had trouble with it), so I ended up installing it from source. This solved the problem.
  • Sometimes pip or easy_install will complain that vcvarsall.bat is missing. This means that pip or easy_install needs a C compiler to build part of the package you asked it to install. To fix this, you need to install Microsoft Visual Studio and add it to your PATH. These Stack Overflow answers should help, but make sure you know which C compiler your Python version uses, in order to avoid problems (different Python versions do not use the same one).
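
Several of the tips above depend on facts about your own Python installation: whether it is 32-bit or 64-bit, which version it is, which C compiler it was built with, and whether its folders are on your PATH. The following is only a rough sketch of how you might check all of that in one go. The function names `interpreter_summary` and `missing_from_path` are my own inventions for illustration, and the "Scripts" subfolder assumes a standard Windows install like the C:\Python27 example.

```python
import os
import platform
import struct
import sys


def interpreter_summary():
    """Collect the facts about this interpreter that decide which downloads match it."""
    return {
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        # Size of a C pointer in bits: 32 on a 32-bit Python, 64 on a 64-bit one.
        "bits": struct.calcsize("P") * 8,
        # The compiler Python itself was built with (relevant to the vcvarsall.bat tip).
        "compiler": platform.python_compiler(),
        # Folder that contains the running interpreter.
        "install_dir": os.path.dirname(os.path.abspath(sys.executable or "")),
    }


def missing_from_path(directories, path_value=None):
    """Return the directories that do not appear in the given PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(path):
        # Ignore case differences (on Windows) and trailing separators.
        return os.path.normcase(os.path.normpath(path))

    entries = set(norm(p) for p in path_value.split(os.pathsep) if p)
    return [d for d in directories if norm(d) not in entries]


if __name__ == "__main__":
    info = interpreter_summary()
    for key in ("version", "bits", "compiler", "install_dir"):
        print("%-12s %s" % (key, info[key]))
    # Assumption: a Windows-style layout with a "Scripts" subfolder.
    wanted = [info["install_dir"], os.path.join(info["install_dir"], "Scripts")]
    for directory in missing_from_path(wanted):
        print("Not on PATH: " + directory)
```

Save it as a .py file and run it with the Python you intend to use. If "bits" says 64 and an installer cannot find Python, that matches the first tip. The "compiler" line tells you which Visual Studio version to look for.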
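
If you would rather not type each import by hand in IDLE, the check described above (import the package and see whether Python complains) can be run for several packages at once. This is a minimal sketch, not an official NLTK tool. `check_imports` is a name I made up, and I am assuming that the YAML package is imported as `yaml`, which is the import name PyYAML uses.

```python
import importlib


def check_imports(names):
    """Try to import each module; map its name to a version string, or None on failure."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Python could not find or load the package.
            results[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    # Assumption: these are the import names of the packages mentioned in this post.
    packages = ["numpy", "matplotlib", "yaml", "nltk"]
    for name, version in sorted(check_imports(packages).items()):
        print("%-12s %s" % (name, version if version is not None else "NOT FOUND"))
```

Anything reported as NOT FOUND either was not installed or was installed for a different Python than the one running the script.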
