\documentclass[12pt]{article}
\usepackage[english]{babel}
\usepackage{natbib}
\usepackage{url}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{graphicx}
\graphicspath{{images/}}
\usepackage{parskip}
\usepackage{fancyhdr}
\usepackage{vmargin}
\setmarginsrb{3 cm}{2.5 cm}{3 cm}{2.5 cm}{1 cm}{1.5 cm}{1 cm}{1.5 cm}
\title{Research Papers Easy Access}
\author{Learners}
\makeatletter
\let\thetitle\@title
\let\theauthor\@author
\let\thedate\@date
\makeatother
\pagestyle{fancy}
\fancyhf{}
\rhead{\theauthor}
\lhead{\thetitle}
\cfoot{\thepage}
\begin{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{titlepage}
\centering
\vspace*{0.5 cm}
\includegraphics[scale = 0.35]{IIT_Bombay_color_logo.png}\\[1.0 cm] % University Logo
\textsc{\Large INDIAN INSTITUTE OF TECHNOLOGY, BOMBAY}\\[1.0 cm]
\textsc{\large COMPUTER SCIENCE AND ENGINEERING}\\[0.2 cm] % Branch
\textsc{\Large SOFTWARE LAB}\\[0.2 cm] % Subject
\textsc{\Large CS699}\\[0.5 cm] % Course Code
\textsc{\large \textbf{RESEARCH PAPERS EASY ACCESS }}\\[0.2 cm]
% Project Name
\rule{\linewidth}{0.2 mm} \\[1.4 cm]
\begin{minipage}{1\textwidth}
\begin{flushright}
\emph{\textbf{TEAM NAME: LEARNERS}} \\
DEEPTI MITTAL 193050025\linebreak
MONIKA SROHA 193050087\linebreak
SWARIL SINGHAL 193050039\linebreak
% Your Student Number
\end{flushright}
\end{minipage}\\[2 cm]
\vfill
\end{titlepage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\tableofcontents
\pagebreak
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{ABSTRACT}
We generate a ``word cloud'' for the research papers of a professor. The papers are extracted from the professor's homepage. The cloud gives greater prominence to words that appear more frequently in the papers. The word cloud is clickable: clicking any word displays the list of research papers in which that word appears.
\section{INTRODUCTION}
We represent the research papers published by a professor in the form of a word cloud. A word cloud is an image composed of the words used in a particular text. Our word cloud is based on the terms appearing in the research papers published by a professor; these terms are extracted from the links to research papers provided on the professor's homepage. Each word in the cloud is a clickable link that displays the list of research papers containing that word. The cloud is built from the most frequently occurring words in the papers, and each word is weighted according to its frequency.
\section{MOTIVATION}
Word clouds are an effective tool to represent what is emphasized in a text. They can be used to display the fields of interest of different professors. For example, if you construct a word cloud for an algorithms professor, you can instantly see which words appear most frequently in their research papers, and thus spot the topics they focus on the most. Just like an info-graphic and other compelling pictorial representations, they:
\begin{itemize}
\setlength\itemsep{0.5em}
\item Make an impact
\item Are easy to understand
\item Can easily be shared
\item Ease the handling of papers published by a professor
\end{itemize}
These word clouds are clickable. On clicking any word, the application displays the list of research papers that contain it, so that users can find the topics of their interest that intersect with the professor's research interests and navigate directly to the related papers.
\section{PRIOR WORK}
To our knowledge, there is no prior work on representing the research papers of a professor in the form of a word cloud, on classifying a professor's papers word-wise, or on easing the handling of a professor's papers in this way.
\section{FEATURES}
\begin{itemize}
\setlength\itemsep{0.5em}
\item Clickable Word-Cloud classifying Research papers word-wise.
\item Each word will display the list of Research papers containing that word.
\item Each Research Paper in the list will be a link pointing directly to the Research Paper.
\item Ease in handling papers published by a professor.
\item A clean GUI for browsing research papers.
\end{itemize}
\pagebreak
\section{TECHNOLOGIES USED}
\section*{Django}
Django is an open-source web framework that encourages rapid development and clean, pragmatic design. Django takes security seriously and helps developers avoid many common security mistakes.
Some of the busiest sites on the Web leverage Django’s ability to quickly and flexibly scale. It takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel.
\section*{Python Tools and Libraries}
Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is designed to be highly readable. It frequently uses English keywords where other languages use punctuation, and it has fewer syntactic constructions than other languages. It supports functional and structured programming methods as well as OOP. It can be used as a scripting language or can be compiled to byte-code for building large applications. It provides very high-level dynamic data types and supports dynamic type checking. It can be easily integrated with C, C++, COM, ActiveX, CORBA, and JavaScript. The libraries used in the project are:
\\
\begin{description}
\setlength\itemsep{2 em}
\item[\textbf{Beautiful Soup}] \hfill \\ Beautiful Soup is used for web scraping: the process of downloading data from websites and extracting valuable information from it. We use it to extract the research papers from a professor's homepage, given its URL. (A short combined sketch of these libraries follows this list.)
\item[\textbf{PDF-Miner}] \hfill \\ PDF-Miner is a tool for extracting information from PDF documents. Unlike other PDF-related tools, it focuses entirely on getting and analyzing text data. PDF-Miner allows one to obtain the exact location of text in a page, as well as other information such as fonts or lines. It includes a PDF converter that can transform PDF files into other text formats.
\item[\textbf{Pattern.en}] \hfill \\ The pattern.en module contains a fast part-of-speech tagger for English (it identifies nouns, adjectives, verbs, etc.\ in a sentence), sentiment analysis, tools for English verb conjugation and noun singularization \& pluralization, and a WordNet interface. This library helps in selecting valid words from the papers and merging the frequencies of similar words, for example merging the plural form of a word into its singular form.
\end{description}
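As a brief illustration of how these libraries fit together, here is a minimal sketch (not the project code; the URL is a placeholder):

\begin{verbatim}
import requests
from bs4 import BeautifulSoup
from pattern.text.en import singularize

# Fetch a (hypothetical) publications page and collect its PDF links.
page = requests.get("https://example.org/publications")
soup = BeautifulSoup(page.content, "html5lib")
pdf_links = [a.get("href") for a in soup.find_all("a")
             if a.get("href", "").endswith(".pdf")]

# Normalize plural forms so that "graphs" and "graph" share one count.
print(singularize("graphs"))  # graph
\end{verbatim}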
\section*{JavaScript, HTML and CSS}
JavaScript is used in the frontend. JavaScript is a high-level, dynamic, untyped, and interpreted programming language. JavaScript is used for displaying the words, making them clickable and finally showing the links to the research papers.
\pagebreak
\vfill
\section{WORKING PROCESS}
This is a web application with an easy-to-use interface. The application asks for the URL of a professor's homepage. Given the URL, it extracts the link to the actual publications page, then extracts all links to downloadable PDFs by reading the structure of the professor's homepage, and finally downloads the PDFs to the machine using the extracted links. A frequency counter is applied to each research paper to count word occurrences and find the 50 most frequent words, while removing uninformative words (articles, conjunctions, common adjectives, etc.) known as stop words; we use a list of over 1000 stop words to keep only meaningful terms. The application then creates a CSV of the top 50 words by frequency. This CSV is used to generate a word cloud, with each word weighted by its frequency, which is displayed in the web app. The word cloud is clickable: each word is a link, and clicking it displays the list of the professor's research papers that contain that word.
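The core counting step can be sketched as follows (a simplified version of the logic in views.py; the file names are illustrative):

\begin{verbatim}
from collections import Counter

stopwords = set(line.strip() for line in open("stopwords.txt"))
words = open("paper.txt").read().lower().split()
counts = Counter(w.strip('.,:()"!*-') for w in words
                 if w not in stopwords)
top50 = counts.most_common(50)  # [(word, frequency), ...]
\end{verbatim}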
\pagebreak
\section{IMPLEMENTATION}
To run Research Papers Easy Access, follow these steps:\\
\begin{enumerate}
\setlength\itemsep{2 em}
\item To host the server, open a terminal in the project directory and run the command\\
\begin{center}
\emph{python3 manage.py runserver}\\
\vspace{0.5in}
\hspace*{-0.5in}
\centerline{\includegraphics[scale = 0.35]{1.png}}
\\[1.0 cm]
\end{center}
\pagebreak
\item Now open any browser and go to \emph{localhost:8000}. This will display the homepage of Research Paper Easy Access.
\vspace{0.5 in}
\\
\hspace*{-0.5in}
\centerline{\includegraphics[scale = 0.35]{2.png}}\\[1.0 cm]
\pagebreak
\item Enter the URL of a professor's homepage. Currently we have implemented it for seven professors:
\begin{description}
\setlength\itemsep{1 em}
\item[Pushpak Bhattacharyya:] \url{https://www.cse.iitb.ac.in/~pb/}
\item[Rohit Gurjar:] \url{https://www.cse.iitk.ac.in/users/rgurjar/}
\item[Varsha Apte:] \url{https://www.cse.iitb.ac.in/~varsha/}
\item[Om Damani:] \url{https://www.cse.iitb.ac.in/~damani/}
\item[Mythili Vutukuru:] \url{https://www.cse.iitb.ac.in/~mythili/}
\item[Ajit Diwan:] \url{https://www.cse.iitb.ac.in/~aad/}
\item[Ganesh Ramakrishnan:] \url{https://www.cse.iitb.ac.in/~ganesh/}
\end{description}
On entering the URL you will get a word cloud of that professor's research papers.
\vspace{0.5 in}
\\
\hspace*{-0.5in}
\centerline{\includegraphics[scale = 0.35]{4.png}}\\
\textbf{Word-Cloud of Prof. Rohit Gurjar}\\[1.0 cm]
\vfill
\pagebreak
\item Now, on clicking any word, the application will display the links to the research papers containing that word.
\vspace{0.5 in}
\\
\hspace*{-0.5in}
\centerline{\includegraphics[scale = 0.35]{3.png}}\\[1.0 cm]
\end{enumerate}
\pagebreak
\section{FUTURE SCOPE}
We have implemented Research Paper Easy Access for seven professors so far, so implementing it for all professors is the next step. Some professors use DBLP to list their research papers, so we will also extract research papers from DBLP links.
\section{CONCLUSION}
The word cloud for research papers is going to be really beneficial from the student's point of view. Earlier, it was a very time-consuming task to find a professor and the research papers that intersect with one's research interests or are needed for academic work. Our GUI-based web application makes this easier while being adaptive and reliable. Along with reducing the effort of finding a research paper, it provides many more advantages, such as speedy analysis of a professor's research interests, easier handling of a professor's papers, and word-wise classification of research papers.
\newpage
\section{REFERENCES}
\begin{itemize}
\item Prof. Kavi Arya
\item Senior Teacher Assistant Diptesh Kanojia
\item www.w3schools.com
\item www.cse.iitb.ac.in
\item www.codingforum.com
\item geeksforgeeks.org
\item stackoverflow.com
\end{itemize}
\end{document}
Git Link: https://git.cse.iitb.ac.in/monikasroha/CS699_Research_Paper_Easy_Access
Project Name: Research Paper Easy Access
Team Name: Learners
Members:
Name: Deepti Mittal    Roll No: 193050025
Name: Swaril Singhal   Roll No: 193050039
Name: Monika Sroha     Roll No: 193050087
Motivation:
Whenever a student wants to read a paper on a particular topic published by a faculty member, he/she has to go through a long list of papers to figure out which one to read. That can be hectic and somewhat irritating! Research Paper Easy Access takes the URL of the faculty member as input and displays the top 50 words on which the faculty member has worked the most. On clicking a particular word, a list of the PDFs that contain that word appears. Now you can easily access the papers!
Hosting from Developer Documentation:
1. Inside the source folder, go to the papers folder.
2. Open the terminal.
3. Type the command: python3 manage.py runserver
4. Now open a browser and go to localhost:8000.
5. Type the URL and the word cloud will be generated.
All of these commands are also specified on the main page of the developer documentation provided in developer.pdf.
#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys
## Documentation of the main function
# @brief Sets papers.settings as the default settings module and runs the
#        management command given on the command line; raises an error if
#        Django cannot be imported.
# @brief No parameters passed.
def main():
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'papers.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    execute_from_command_line(sys.argv)


if __name__ == '__main__':
    main()
"""
Django settings for papers project.
Generated by 'django-admin startproject' using Django 2.2.6.
For more information on this file, see
https://docs.djangoproject.com/en/2.2/topics/settings/
For the full list of settings and their values, see
https://docs.djangoproject.com/en/2.2/ref/settings/
"""
import os
# Build paths inside the project like this: os.path.join(BASE_DIR, ...)
## Documentation for the base directory path
# @brief It specifies the base path of all files and folders of the project.
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/
# SECURITY WARNING: keep the secret key used in production secret!
## Documentation for security key
# @brief The secret key must be kept private when the code is used in production!
SECRET_KEY = '-w-mk58zgjk*y=t61muzzrd+*e)7%+20j__!4_=lh(6^*@)42n'
# SECURITY WARNING: don't run with debug turned on in production!
## Documentation for the DEBUG security setting
# @brief Shows detailed error pages during development. Never set DEBUG to True in production.
DEBUG = True
ALLOWED_HOSTS = []
# Application definition
## To define installed apps
# @brief Lists all applications used; the developer needs to add their own app here. In our case we added 'research_papers'.
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'research_papers',
    'crispy_forms',
]
CRISPY_TEMPLATE_PACK = 'bootstrap4'
#'django.contrib.staticfiles'
## To add the required middleware.
# @brief For our development no changes were needed; the defaults are used.
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
## Documentation for ROOT_URLconfig
# @brief to specify the path for urls file where the urlpatterns are being added.
ROOT_URLCONF = 'papers.urls'
## Documentation for templates
# @brief Specifies the template settings required to run the project.
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]
WSGI_APPLICATION = 'papers.wsgi.application'
# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}
## Password validation
# @brief Used for password validation. Not required in our case.
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]
# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/
## Language code for this installation
LANGUAGE_CODE = 'en-us'
## Time zone for the server
TIME_ZONE = 'UTC'
## Enables Django's translation system
USE_I18N = True
## Enables localized formatting of data
USE_L10N = True
## Enables timezone-aware datetimes
USE_TZ = True
# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/
#STATICFILES_DIRS = (os.path.join( os.path.dirname( __file__ ), 'static' ),)
## Documentation for static file directory path
# @brief It is set to create a folder in base directory to store all the static files to be used such as images etc.
STATICFILES_DIRS = (os.path.join(BASE_DIR, 'static'),)
## Documentation for static file finders
# @brief The finders used to locate static files in the project folders.
STATICFILES_FINDERS = (
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
)
STATIC_URL = '/static/'
#STATIC_ROOT = '/path/to/copy/files/to'
#STATIC_ROOT = os.path.join( os.path.dirname( BASE_DIR ), 'static_cdn' )
MEDIA_ROOT = os.path.join(BASE_DIR, 'media')
MEDIA_URL = '/media/'
"""papers URL Configuration
The `urlpatterns` list routes URLs to views. For more information please see:
https://docs.djangoproject.com/en/2.2/topics/http/urls/
Examples:
Function views
    1. Add an import:  from my_app import views
    2. Add a URL to urlpatterns:  path('', views.home, name='home')
Class-based views
    1. Add an import:  from other_app.views import Home
    2. Add a URL to urlpatterns:  path('', Home.as_view(), name='home')
Including another URLconf
    1. Import the include() function: from django.urls import include, path
    2. Add a URL to urlpatterns: path('blog/', include('blog.urls'))
"""
from django.contrib import admin
from django.urls import path, include
from django.conf import settings
from django.conf.urls.static import static
urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('research_papers.urls')),
]
if settings.DEBUG:
    urlpatterns += static(settings.STATIC_URL, document_root=settings.STATIC_ROOT)
"""
WSGI config for papers project.
It exposes the WSGI callable as a module-level variable named ``application``.
For more information on this file, see
https://docs.djangoproject.com/en/2.2/howto/deployment/wsgi/
"""
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'papers.settings')
application = get_wsgi_application()
from django.contrib import admin
# Register your models here.
from django.apps import AppConfig
class ResearchPapersConfig(AppConfig):
    name = 'research_papers'

from django import forms

class MyForm(forms.Form):
    URL = forms.CharField(label='URL ', widget=forms.TextInput(attrs={'placeholder': 'Enter URL'}))
from django.db import models
# Create your models here.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Research Papers Easy Access</title>
</head>
<body>
URL entered: <strong>{{ URL }}</strong><br><br>
</body>
</html>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Research Papers Easy Access</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js"></script>
<style>
div{
background: transparent;
}
body
{
background-image: url("https://extraconfidencial.com/wp-content/uploads/2019/09/asistentes-virtuales.jpg");
background-position: top;
background-repeat: no-repeat;
background-size: cover !important;
background-color: #cccccc;
}
.content
{
position: absolute;
background: rgb(0,0,0);
background: rgba(0,0,0,0.5);
color: #f1f1f1;
width: 100%;
padding: 20px;
bottom: 30%;
}
span { background-color: #cccccc !important; }
</style>
</head>
<body>
<center>
<div class="content">
<span>
<h1 class="display-3" font="">Research Paper Easy Access</h1>
<form action="/thankyou/" method="post">
{% csrf_token %}
<table>
{{form.as_table}}
</table>
<br>
<input type="submit" class="btn btn-outline-light" value="Submit" />
</form>
</span>
</div>
</center>
</body>
</html>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Thank You</title>
</head>
<body bgcolor = "#d6cbd3">
<h2>Response Entered by you:</h2>
<form method="post">
URL entered: <strong>{{ URL }}</strong><br><br>
<script type="text/javascript">
var word_list_display = {{ word_list_display|safe }}
var word_count_display = {{ word_count_display|safe }}
var most_occur_file_map = {{ most_occur_file_map|safe }}
var pdf_name_list = {{ pdf_name_list|safe }}
var pdf_url_list = {{ pdf_url_list|safe }}
var most_occur_file_map_len = {{ most_occur_file_map_len|safe }}
function links(word, size, file, len, pdfname, urls, i) {
    var a = document.createElement('a');
    var b = document.createElement('br');
    // Create the text node for the anchor element.
    var link = document.createTextNode(word);
    // Append the text node to the anchor element.
    a.appendChild(link);
    // Set the title.
    a.title = word;
    // Set the href property.
    a.href = "javascript:func('" + file + "','" + len + "', '" + pdfname + "','" + urls + "');";
    // Scale the font size with the word's frequency.
    a.style.fontSize = "1.0em";
    var size1 = (0.002 * size);
    a.style.marginLeft = "100px";
    a.style.fontSize = parseFloat(a.style.fontSize) + size1 + "em";
    // Append the anchor element to the body, with a line break every 5 words.
    document.body.appendChild(a);
    if (i % 5 == 0)
        document.body.appendChild(b);
}
function func(file, len, pdfname, urls) {
    document.write("<center><b><u>");
    document.write("List of PDFs <br />");
    document.write("</u></b></center>");
    document.body.style.backgroundColor = "#daebe8";
    var file1 = file.split(',');
    var pdfname1 = pdfname.split(',');
    var urls1 = urls.split(',');
    for (i = 0; i < len; i++) {
        var a = document.createElement('a');
        var b = document.createElement('br');
        // Look up the pdf index, then create the text node for the anchor element.
        var x = parseInt(file1[i]);
        var link = document.createTextNode(pdfname1[x]);
        // Append the text node to the anchor element.
        a.appendChild(link);
        a.target = "_blank";
        // Set the title.
        a.title = pdfname1[x];
        // Set the href property.
        a.href = urls1[x];
        // Append the anchor element to the body.
        document.body.appendChild(a);
        document.body.appendChild(b);
    }
}
for (i = 0; i < {{ x }}; i++) {
    links(word_list_display[i], word_count_display[i], most_occur_file_map[word_list_display[i]], most_occur_file_map_len[word_list_display[i]], pdf_name_list, pdf_url_list, i);
}
</script>
</form>
</body>
</html>
from django.test import TestCase
# Create your tests here.
from django.urls import path
from research_papers import views
#from django.contrib.staticfiles.urls import staticfiles_urlpatterns
#from django.contrib.staticfiles.urls import staticfiles_urlpatterns
## Documentation for urlpatterns, which specify the URLs that can be visited.
# @brief Sets the paths for the pages of the project: both the root page and the /thankyou/ page (which displays the output) map to the research_papers view.
urlpatterns = [
    path('', views.research_papers),
    path('thankyou/', views.research_papers),
]
## Documentation on how to run the code.
#
# To run the code on your system, Python and Django must be installed.
#
# To install Django:
#
# $ sudo apt install python3-django
#
# OR
#
# $ sudo apt install python3-pip
#
# AND
#
# $ pip3 install django
#
## To install Python:
#
# $ sudo apt-get install python3
#
# Create a directory named 'papers'.
#
# Start a project in the terminal by using the following:
#
# $ django-admin startproject research_project
#
# A manage.py file will be created.
#
# From the papers folder, to host the server:
#
# $ python3 manage.py runserver
#
# The code is in views.py and the templates are created in the 'templates' folder inside research_papers.
#
# Settings for the settings.py and manage.py files:
#
# They have to be set as specified in the settings.py file and as per requirements.
#
# The server is set up. Now you can easily access it!!
##@mainpage
# @author Learners
import requests
from bs4 import BeautifulSoup
import os
import glob
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage
from collections import Counter
import PyPDF2
import csv
from pattern.text.en import singularize, pluralize
import numpy as np
from io import StringIO
from wordcloud import WordCloud, STOPWORDS
from PIL import Image
import urllib
import matplotlib.pyplot as plt
from django.shortcuts import render
from research_papers.forms import MyForm
from django.template import loader
from django.http import HttpResponse
## Documentation for the wc() function.
# @brief This function opens a csv file containing a list of words along with their frequencies.
# @details It then saves an image of the generated word cloud at the specified location, with word sizes proportional to frequency. This function is not used in the current development, but if an image word cloud is needed, the function body can be uncommented.
def wc():
    '''reader = csv.reader(open('counts.csv', 'r'))
    d = {}
    for k, v in reader:
        # insert the frequency values in the d dictionary
        d[k] = float(v)
    # create a dummy mask (shape) for the wordcloud
    # mask = np.array(Image.open("images.jpeg"))
    # a word cloud for the words whose frequencies are stored in d is generated
    wordcloud = WordCloud(width=5000, height=6000, max_words=400, stopwords=STOPWORDS,
                          random_state=42).generate_from_frequencies(d)
    wordcloud.to_file('papers/static/image/images.png')  # saves the wordcloud to images.png'''
## Documentation for the convert_pdf function.
# @param path The path of the file (a PDF) which has to be converted to a string.
# @brief This function opens the pdf file and converts its contents into a string.
# @return text A string storing the whole file content.
def convert_pdf(path):
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    fp = open(path, 'rb')
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""
    maxpages = 0
    caching = True
    pagenos = set()
    for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages, password=password,
                                  caching=caching, check_extractable=True):
        interpreter.process_page(page)
    fp.close()
    device.close()
    text = retstr.getvalue()
    retstr.close()
    return text
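# Example usage (illustrative file name, assuming a PDF saved by savepdf below):
#   text = convert_pdf("pdfstore/pdf0.pdf")
#   print(text[:200])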
## Documentation for the extractpdf_pb function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Pushpak Bhattacharyya).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_pb(soup, add_url):
    # find the "Yearwise Papers" link in the sub-list menu
    table = soup.find('ul', attrs={'class': 'sub-list'})
    for row in table.findAll('li'):
        if (row.get_text()) == "Yearwise Papers":
            url_significant = row.a['href']
            url_significant = add_url + url_significant
            print(url_significant)
    req_pdf = requests.get(url_significant)
    soup = BeautifulSoup(req_pdf.content, 'html5lib')
    table = soup.find('ol')
    url_pdf = []
    count = 0
    c = 0
    for row in table.findAll('li'):
        pdf = {}
        if row.a is None:
            c = c + 1
            continue
        else:
            pdf['url'] = add_url + row.a['href']
            pdf['name'] = row.a.get_text()
            url_pdf.append(pdf)
            count = count + 1
    return url_pdf
## Documentation for the extractpdf_varsha function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Varsha Apte).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_varsha(soup, add_url):
    table = soup.find('div', attrs={'id': 'menu'})  # to go to the research page
    for row in table.findAll('li'):
        if (row.get_text()) == "Research":
            url_significant = row.a['href']
            url_significant = add_url + url_significant
    # to go to the publications page
    req_pdf = requests.get(url_significant)
    soup = BeautifulSoup(req_pdf.content, 'html5lib')
    for row in soup.findAll('a'):
        if (row.get_text()) == "Here":
            url_significant = row.get('href')
            url_significant = add_url + url_significant
    # to get the list of pdfs
    req_pdf = requests.get(url_significant)
    soup = BeautifulSoup(req_pdf.content, 'html5lib')
    table = soup.find('ol', attrs={'type': '1'})
    url_pdf = []
    count = 0
    c = 0
    for row in table.findAll('li'):
        pdf = {}  # stores the url corresponding to the name of the pdf
        if row.a is None or row.a.get_text() == " ":
            c = c + 1
            continue
        else:
            pdf['url'] = add_url + row.a['href']
            pdf['name'] = row.a.get_text()
            url_pdf.append(pdf)
            count = count + 1
    return url_pdf
## Documentation for the extractpdf_ganesh function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Ganesh Ramakrishnan).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_ganesh(soup, add_url):
    table = soup.find('center')
    for row in table.findAll('a'):
        if row.b is not None and (row.b.get_text()) == "Publications":
            url_significant = row['href']
            url_significant = add_url + url_significant
            print(url_significant)
    req_pdf = requests.get(url_significant)
    soup = BeautifulSoup(req_pdf.content, 'html5lib')
    url_pdf = []
    count = 0
    c = 0
    for row in soup.findAll('ol'):
        for x in row.findAll('li'):
            pdf = {}
            if x.a is None or ("Handbook" in x.a.get_text()):
                c = c + 1
                continue
            else:
                if "http" in x.a['href']:
                    pdf['url'] = x.a['href']
                else:
                    pdf['url'] = add_url + x.a['href']
                pdf['name'] = x.a.get_text()
                url_pdf.append(pdf)
                count = count + 1
    return url_pdf
## Documentation for the extractpdf_supratik function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Supratik Chakraborty).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_supratik(soup, add_url):
    for row in soup.findAll('a'):
        if "Publications" in (row.get_text()):
            url_significant = row.get('href')
            url_significant = add_url + url_significant
    req_pdf = requests.get(url_significant)
    soup = BeautifulSoup(req_pdf.content, 'html5lib')
    table = soup.find('ol')
    url_pdf = []
    count = 0
    c = 0
    for row in table.findAll('li'):
        pdf = {}
        if row.a is None:
            c = c + 1
            continue
        else:
            pdf['url'] = add_url + row.a['href']
            pdf['name'] = row.a.get_text()
            url_pdf.append(pdf)
            count = count + 1
    return url_pdf
## Documentation for the extractpdf_mythili function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Mythili Vutukuru).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_mythili(soup, add_url):
    table = soup.find('table', attrs={'cellpadding': '10', 'cellspacing': '0', 'width': '100%'})
    url_pdf = []
    count = 0
    c = 0
    for row in soup.findAll('ul'):
        for x in row.findAll('li'):
            pdf = {}
            if x.a is None or x.em is None:
                c = c + 1
                continue
            else:
                pdf['url'] = add_url + x.a['href']
                pdf['name'] = x.em.get_text()
                url_pdf.append(pdf)
                count = count + 1
    return url_pdf
## Documentation for the extractpdf_rohit function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Rohit Gurjar).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_rohit(soup, add_url):
    url_pdf = []
    count = 0
    c = 0
    for row in soup.findAll('ul'):
        for x in row.findAll('li'):
            pdf = {}
            if x.a is None or x.i is None:
                c = c + 1
                continue
            else:
                pdf['url'] = add_url + x.a['href']
                pdf['name'] = x.a.get_text()
                url_pdf.append(pdf)
                count = count + 1
    return url_pdf
## Documentation for the extractpdf_om function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Om Damani).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_om(soup, add_url):
    table = soup.find('ul')
    url_pdf = []
    count = 0
    c = 0
    for row in table.findAll('li'):
        pdf = {}
        if row.a is None:
            c = c + 1
            continue
        else:
            if "http" in row.a['href']:
                continue
            else:
                pdf['url'] = add_url + row.a['href']
                pdf['name'] = row.a.get_text()
                url_pdf.append(pdf)
                count = count + 1
    return url_pdf
## Documentation for the extractpdf_ajit function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param add_url The base url of the professor's page.
# @brief This function processes the html page to extract the names and links of the pdfs of the research papers published by the faculty member (Prof. Ajit Diwan).
# @details It visits the actual page where the paper list is displayed and, with the help of Beautiful Soup, fetches each paper name along with the url to download the paper. The details of a single paper are stored in a dictionary with keys 'name' and 'url', forming a list of such dictionaries.
# @return url_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_ajit(soup, add_url):
    table = soup.find('ol')
    url_pdf = []
    count = 0
    c = 0
    for row in table.findAll('li'):
        pdf = {}
        if row.a is None:
            c = c + 1
            continue
        else:
            if "http" in row.a['href']:
                continue
            else:
                pdf['url'] = add_url + row.a['href']
                pdf['name'] = row.a.get_text()
                url_pdf.append(pdf)
                count = count + 1
    return url_pdf
## Documentation for the extractpdf_prof function.
# @param soup The html page fetched by requests, parsed with Beautiful Soup using the html5lib parser.
# @param URL The base url of the professor's page.
# @brief This function figures out which extractor to call on the basis of the given input URL.
# @details It calls the extractpdf function matching the professor's URL, which returns the list of pdfs mapped to URLs. If a URL for which no function mapping exists yet is passed, it prints an error message.
# @return list_pdf The list of dictionaries containing the paper names along with their URLs.
def extractpdf_prof(soup, URL):
    list_pdf = []
    if URL == "https://www.cse.iitb.ac.in/~pb/":
        list_pdf = extractpdf_pb(soup, URL)
    elif URL == "https://www.cse.iitb.ac.in/~ganesh/":
        list_pdf = extractpdf_ganesh(soup, URL)
    elif URL == "https://www.cse.iitb.ac.in/~aad/":
        list_pdf = extractpdf_ajit(soup, URL)
    elif URL == "https://www.cse.iitb.ac.in/~damani/":
        list_pdf = extractpdf_om(soup, URL)
    elif URL == "https://www.cse.iitb.ac.in/~mythili/":
        list_pdf = extractpdf_mythili(soup, URL)
    elif URL == "https://www.cse.iitk.ac.in/users/rgurjar/":
        list_pdf = extractpdf_rohit(soup, URL)
    elif URL == "https://www.cse.iitb.ac.in/~varsha/":
        list_pdf = extractpdf_varsha(soup, URL)
    else:
        print("No extractor has been defined for this URL yet :)")
        exit()
    return list_pdf
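# Design note (a sketch, not part of the original source): since each supported
# homepage maps one-to-one to an extractor, the if/elif chain above can also be
# expressed as a dictionary dispatch, which makes adding a new professor a
# one-line change. The name extractpdf_prof_v2 is hypothetical.
EXTRACTORS = {
    "https://www.cse.iitb.ac.in/~pb/": extractpdf_pb,
    "https://www.cse.iitb.ac.in/~ganesh/": extractpdf_ganesh,
    "https://www.cse.iitb.ac.in/~aad/": extractpdf_ajit,
    "https://www.cse.iitb.ac.in/~damani/": extractpdf_om,
    "https://www.cse.iitb.ac.in/~mythili/": extractpdf_mythili,
    "https://www.cse.iitk.ac.in/users/rgurjar/": extractpdf_rohit,
    "https://www.cse.iitb.ac.in/~varsha/": extractpdf_varsha,
}

def extractpdf_prof_v2(soup, URL):
    # look up the extractor for this homepage; fail loudly for unknown URLs
    extractor = EXTRACTORS.get(URL)
    if extractor is None:
        raise ValueError("No extractor has been defined for " + URL)
    return extractor(soup, URL)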
## Documentation for the savepdf function.
# @param list_pdf The list of dictionaries storing each pdf name mapped to its URL.
# @brief This function stores all the pdfs on the system for further processing.
# @details It creates a directory named pdfstore where all the pdfs are stored as pdf0.pdf, pdf1.pdf and so on.
# @return It returns nothing; it displays the message "saved" when all pdfs have been downloaded.
def savepdf(list_pdf):
    x = len(list_pdf)
    os.mkdir("pdfstore")
    for i in range(0, x):
        file_url = list_pdf[i]['url']
        r = requests.get(file_url, stream=True)
        filename = "pdf" + str(i) + ".pdf"
        print(filename)
        path = "pdfstore/" + filename
        print(path)
        with open(path, "wb") as pdf:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk:
                    pdf.write(chunk)
    print("saved")
## Documentation for the research_papers function.
# @param request The HTTP request object, used to read the data posted on the page.
# @brief This function processes the URL and returns the processed words to be displayed in the form of a word cloud.
# @details On receiving the URL from the user, it calls the functions to extract the list of pdf links and downloads the pdfs using the savepdf function. Afterwards, it calls convert_pdf to convert each pdf to a string, stores it as a text file and counts the words; the whole procedure is repeated for all files. Then the top 50 occurring words are chosen to be displayed.
# @return response The response to the page after processing the request, in the form of a word cloud.
def research_papers(request):
    # if the form is submitted
    if request.method == 'POST':
        myForm = MyForm(request.POST)
        if myForm.is_valid():
            URL = myForm.cleaned_data['URL']
            req_home = requests.get(URL)
            soup = BeautifulSoup(req_home.content, 'html5lib')
            list_pdf = extractpdf_prof(soup, URL)
            pdf_name_list1 = []
            pdf_name_list = []
            pdf_url_list = []
            for i in list_pdf:
                pdf_name_list1.append(i['name'])
                pdf_url_list.append(i['url'])
            for i in pdf_name_list1:
                z = i.replace(",", " ")
                z2 = z.rstrip()
                pdf_name_list.append(z2)
            savepdf(list_pdf)  # to store the pdfs
            pdflist = glob.glob("pdfstore/*.pdf")
            print(pdflist)
            wordcount = {}
            filemap = {}
            for pdf in pdflist:
                print("Working on: " + pdf + '\n')
                # recover the numeric index from a path such as "pdfstore/pdf12.pdf"
                xlist = list(pdf)
                ind = xlist[::-1].index("/")
                ind = len(xlist) - ind
                pdfname = xlist[ind + 3:-4]
                pdfname = "".join(pdfname)
                pdfname = int(pdfname)
                fout = open('pdfs.txt', 'w+', encoding="utf-8")
                # skip unreadable pdfs; otherwise dump the extracted text to pdfs.txt
                try:
                    PyPDF2.PdfFileReader(open(pdf, 'rb'))
                except PyPDF2.utils.PdfReadError:
                    continue
                else:
                    fout.write(convert_pdf(pdf))
                fout.close()
                fin = open('pdfs.txt', 'r', encoding="utf-8")
                words = fin.read().lower()
                # split() returns a list of all the words in the string
                split_it = words.split()
                stopwords = set(line.strip() for line in open('stopwords.txt', encoding="utf-8"))
                # For every word in the file, add it to the dictionary if it doesn't
                # exist; if it does, increase the count. Punctuation is stripped first,
                # and singular/plural variants of an already-seen word share one count.
                for word in split_it:
                    for ch in '.,:"!“‘*()\'-':
                        word = word.replace(ch, "")
                    if word not in stopwords:
                        if word in wordcount:
                            wordcount[word] += 1
                            if pdfname not in filemap[word]:
                                filemap[word].append(pdfname)
                        elif singularize(word) in wordcount:
                            wordcount[singularize(word)] += 1
                            if pdfname not in filemap[singularize(word)]:
                                filemap[singularize(word)].append(pdfname)
                        elif pluralize(word) in wordcount:
                            wordcount[pluralize(word)] += 1
                            if pdfname not in filemap[pluralize(word)]:
                                filemap[pluralize(word)].append(pdfname)
                        else:
                            wordcount[word] = 1
                            filemap[word] = [pdfname]
                print("done")
            n_print = 50
            print("\nOK. The {} most common words are as follows\n".format(n_print))
            cnt = Counter(wordcount)
            # most_common() produces the n_print most frequently encountered
            # words and their respective counts.
            most_occur = cnt.most_common(n_print)
            lst = list()
            for i in enumerate(most_occur):
                tmp2 = list()
                tmp2.append(i[1][0])
                tmp2.append(i[1][1])
                lst.append(tmp2)
            np.savetxt("counts.csv", lst, delimiter=',', fmt='%s', encoding="utf-8")
            most_occur_file_map = {}
            most_occur_file_map_len = {}
            most_occur_dict = {}
            for i in enumerate(most_occur):
                most_occur_dict[i[1][0]] = i[1][1]
                most_occur_file_map[i[1][0]] = filemap[i[1][0]]
                most_occur_file_map_len[i[1][0]] = len(filemap[i[1][0]])
            word_list_display = []
            word_count_display = []
            for i in most_occur_dict:
                word_list_display.append(i)
                word_count_display.append(most_occur_dict[i])
            # wc()  # uncomment (together with the wc() body) to also save an image word cloud
            x = len(word_list_display)
            y = len(pdf_name_list)
            context = {'URL': URL, 'word_list_display': word_list_display,
                       'word_count_display': word_count_display, 'x': x,
                       'most_occur_file_map': most_occur_file_map,
                       'pdf_name_list': pdf_name_list,
                       'pdf_url_list': pdf_url_list,
                       'most_occur_file_map_len': most_occur_file_map_len,
                       'y': y}
            template = loader.get_template('thankyou.html')
            return HttpResponse(template.render(context, request))
    else:
        form = MyForm()
        return render(request, 'research_papers.html', {'form': form})
a
able
about
above
abst
accordance
according
accordingly
across
act
actually
added
adj
affected
affecting
affects
after
afterwards
again
against
ah
all
almost
alone
along
already
also
although
always
am
among
amongst
an
and
announce
another
any
anybody
anyhow
anymore
anyone
anything
anyway
anyways
anywhere
apparently
approximately
are
aren
arent
arise
around
as
aside
ask
asking
at
auth
available
away
awfully
b
back
be
became
because
become
becomes
becoming
been
before
beforehand
begin
beginning
beginnings
begins
behind
being
believe
below
beside
besides
between
beyond
biol
both
brief
briefly
but
by
c
ca
came
can
cannot
can't
cause
causes
certain
certainly
co
com
come
comes
contain
containing
contains
could
couldnt
d
date
did
didn't
different
do
does
doesn't
doing
done
don't
down
downwards
due
during
e
each
ed
edu
effect
eg
eight
eighty
either
else
elsewhere
end
ending
enough
especially
et
et-al
etc
even
ever
every
everybody
everyone
everything
everywhere
ex
except
f
far
few
ff
fifth
first
five
fix
followed
following
follows
for
former
formerly
forth
found
four
from
further
furthermore
g
gave
get
gets
getting
give
given
gives
giving
go
goes
gone
got
gotten
h
had
happens
hardly
has
hasn't
have
haven't
having
he
hed
hence
her
here
hereafter
hereby
herein
heres
hereupon
hers
herself
hes
hi
hid
him
himself
his
hither
home
how
howbeit
however
hundred
i
id
ie
if
i'll
im
immediate
immediately
importance
important
in
inc
indeed
index
information
instead
into
invention
inward
is
isn't
it
itd
it'll
its
itself
i've
j
just
k
keep
keeps
kept
kg
km
know
known
knows
l
largely
last
lately
later
latter
latterly
least
less
lest
let
lets
like
liked
likely
line
little
'll
look
looking
looks
ltd
m
made
mainly
make
makes
many
may
maybe
me
mean
means
meantime
meanwhile
merely
mg
might
million
miss
ml
more
moreover
most
mostly
mr
mrs
much
mug
must
my
myself
n
na
name
namely
nay
nd
near
nearly
necessarily
necessary
need
needs
neither
never
nevertheless
new
next
nine
ninety
no
nobody
non
none
nonetheless
noone
nor
normally
nos
not
noted
nothing
now
nowhere
o
obtain
obtained
obviously
of
off
often
oh
ok
okay
old
omitted
on
once
one
ones
only
onto
or
ord
other
others
otherwise
ought
our
ours
ourselves
out
outside
over
overall
owing
own
p
page
pages
part
particular
particularly
past
per
perhaps
placed
please
plus
poorly
possible
possibly
potentially
pp
predominantly
present
previously
primarily
probably
promptly
proud
provides
put
pushpak
bhattacharya
q
que
quickly
quite
qv
r
ran
rather
rd
re
readily
really
recent
recently
ref
refs
regarding
regardless
regards
related
relatively
research
respectively
resulted
resulting
results
right
run
s
said
same
saw
say
saying
says
sec
section
see
seeing
seem
seemed
seeming
seems
seen
self
selves
sent
sentences
sentence
in-
features
tion
seven
several
shall
she
shed
she'll
shes
should
shouldn't
show
showed
shown
showns
shows
significant
significantly
similar
similarly
since
six
slightly
so
some
somebody
somehow
someone
somethan
something
sometime
sometimes
somewhat
somewhere
soon
sorry
specifically
specified
specify
specifying
still
stop
strongly
sub
substantially
successfully
such
sufficiently
suggest
sup
sure
t
take
taken
taking
based
tell
tends
total
high
low
th
than
thank
thanks
thanx
that
that'll
thats
that've
the
their
theirs
them
themselves
then
thence
there
thereafter
thereby
thered
therefore
therein
there'll
thereof
therere
theres
thereto
thereupon
there've
these
they
theyd
they'll
theyre
they've
think
this
those
thou
though
thoughh
thousand
throug
through
throughout
thru
thus
til
tip
to
together
too
took
toward
towards
tried
tries
truly
try
trying
ts
twice
two
u
un
under
unfortunately
unless
unlike
unlikely
until
unto
up
upon
ups
us
use
used
useful
usefully
usefulness
uses
using
usually
v
value
various
've
very
via
viz
vol
vols
vs
w
want
wants
was
wasnt
way
we
wed
welcome
we'll
went
were
werent
we've
what
whatever
what'll
whats
when
whence
whenever
where
whereafter
whereas
whereby
wherein
wheres
whereupon
wherever
whether
which
while
whim
whither
who
whod
whoever
whole
who'll
whom
whomever
whos
whose
why
widely
willing
wish
with
within
without
wont
words
world
would
wouldnt
www
x
y
yes
yet
you
youd
you'll
your
youre
yours
yourself
yourselves
you've
z
zero
&
1
2
3
4
5
6
7
8
9
0
proposed
/
*
-
+
.
!
@
#
$
%
^
(
)
{
}
[
]
\
'
"
;
:
<
>
?
_
=
'cos
'd
'm
're
ago
ahead
aren't
backward
backwards
beneath
couldn't
despite
forward
hadn't
however
I
inside
inspite
mayn't
mightn't
mine
mustn't
needn't
oughtn't
seldom
shan't
till
usedn't
usen't
wasn't
well
weren't
will
won't
wouldn't
's
't
n't
corporation
corp
corp.
llc
inc.
ltd.
llp
llp.
plc
plc.
!!
?!
??
!?
`
``
''
-lrb-
-rrb-
-lsb-
-rsb-
,
..
...
address
advance
advanced
al
application
approach
area
aspect
book
cant
century
circa
collected
collection
compared
concept
condition
conference
consider
considered
consist
consisted
contemporary
contribution
course
criticism
critique
currently
das
de
demonstrate
demonstrated
der
detail
development
die
difficult
discuss
discussed
early
easily
easy
edited
edition
editor
een
el
elementary
en
essay
essential
evaluation
example
exercise
finally
foundation
global
good
guide
handbook
happen
happened
held
het
illustrated
impact
implication
including
influence
intermediate
international
introduce
introduction
introductory
investigating
investigation
involving
issue
iv
ix
journal
kind
kingdom
la
le
lecture
lecture notes
library
lo
main
making
manual
method
modern
monograph
num
number
organisation
original
outline
outlined
overview
pamphlet
paper
paperback
party
perspective
possibility
practical
practice
presented
previous
principle
problem
proceeding
process
project
publication
published
publisher
reader
reading
reference
relating
report
respect
review
role
selected
seminar
series
service
short
simple
source
sourcebook
special
state of the art
studies
study
subject
suggested
suitable
supplement
survey
system
technique
text
theme
theories
theory
today
tomorrow
topic
und
understand
united
up to date
view
volume
workbook
year
documents
logs
2003
fig
called
state
based
failure
processing
2008
proof
corpus
work
sentence
dependent
performance
co-occurence
2010
character
relation
derications
entry
rules
node
table
features
theorem
function3size
cases
proof
>
ax
clearly
bound
claim
fact
define
coefficient
v1
programs
definition
gcid48
cid48
defined
cases
m2
0
22
basis
assume
exists
find
smaller
individual
version
or
idea
recall
project
nonzero
c1
x2
u2
v3
construction
base
blue
science
concentration
questions
categories
m2
To run the source:
1. Inside the source folder, go to the papers folder.
2. Open the terminal.
3. Type the command: python3 manage.py runserver
4. Now open a browser and go to localhost:8000.
5. Type the URL and the word cloud will be generated.
Example:
Open the terminal inside the papers folder:
python3 manage.py runserver
Open a new window in a browser and visit localhost on port 8000.
On the page, type a URL in the given input box.
Ex: https://www.cse.iitb.ac.in/~varsha/
A word cloud will be shown on the page. Click on the desired word to display the list of PDFs. To read a PDF, click it.