It's been over three months since I quit my full-time job. One of the areas that has fascinated me all this while is data analysis. Over coffee a few months back, Anurag Ramdasan, a friend and mentor, suggested I check out https://data.gov.in/
After reading around and spending time on YouTube, Python's Pandas felt like the package worth spending time learning.
So, what exactly is Pandas?
Pandas is a Python library used extensively for data analysis, including on tables with millions of rows. Python is one of the easiest languages to pick up and has wide community support, which makes it an excellent choice. You can find more details and use cases on the official website: http://pandas.pydata.org/ Also, did I tell you it is open source? 🙂
Installing Pandas is very easy: all you need is pip or conda (the package manager that ships with Anaconda), both popular Python package managers.
conda install pandas
pip install pandas
Alternatively, check the install guide if you want to install binary packages for your operating system or build from source: http://pandas.pydata.org/pandas-docs/stable/install.html
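Whichever route you take, a quick check from Python can confirm that the install worked. This is only a minimal sketch: the tiny table and its column names ("a", "b") are made up for the check and have nothing to do with the literacy data.

```python
import pandas as pd

# Importing without an error already shows the package is installed;
# the version string tells you which release you got.
print(pd.__version__)

# Build a tiny throwaway table (hypothetical columns "a" and "b")
# to confirm that the core DataFrame type works.
check = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(check.shape)
```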
I downloaded a CSV file that lists the percentage literacy rate for every tenth year from 1951 to 2011. The task I set myself: how can I get a table with the state-wise literacy rate (%) for Bihar, Gujarat and Assam (three of the many states in India) for the years 1991, 2001 and 2011?
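To give a feel for what that task looks like in Pandas, here is a rough sketch. It is not the exact code from my notebook. I am assuming a layout with one row per state, a "State" column and one column per census year. The real file may be organised differently, and the numbers in the sample are placeholders, not real literacy figures. The function name literacy_subset is also just something made up for this illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV. The column names are an
# assumed layout and the numbers are placeholders, NOT real literacy rates.
SAMPLE_CSV = """State,1981,1991,2001,2011
Assam,1.0,2.0,3.0,4.0
Bihar,5.0,6.0,7.0,8.0
Gujarat,9.0,10.0,11.0,12.0
Kerala,13.0,14.0,15.0,16.0
"""


def literacy_subset(csv_source, states, years):
    """Return the rows for `states` and the columns for `years`."""
    # Use the state name as the row label so rows can be picked by name.
    df = pd.read_csv(csv_source, index_col="State")
    # .loc selects by label: rows first, then columns.
    return df.loc[states, years]


result = literacy_subset(
    io.StringIO(SAMPLE_CSV),  # swap in the path to the real CSV file
    states=["Bihar", "Gujarat", "Assam"],
    years=["1991", "2001", "2011"],  # header values are read as strings
)
print(result)
```

Reading the state names in as the index keeps the selection down to a single .loc call. If the real file uses different headers, only the index_col value and the two lists need to change.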
I have embedded an IPython notebook below, with snippets and a step-by-step explanation of the process and the commands I used for this exercise:
- For an introduction to the IPython Notebook and its installation steps, see http://ipython.org/notebook.html
The complete code, along with the CSV file and the IPython notebook, is in the GitHub repo: https://github.com/koolhead17/scripts/tree/001/playin_with_pandas