Project Specifications (link)
Samples: presentation, report
Topic: We will work with multiple datasets, such as the IMDB top 5000 movies from Kaggle, and mine plot keywords and other attributes so that we can answer questions such as:
- Famous people associated with a movie, e.g. movies on Abraham Lincoln (Famous people dataset from DBpedia).
- Cities/countries in the movie e.g. movies about California.
- Sports associated with the movie e.g. basketball movies.
- Neha Sachdev
- Vasini Chandrasekaran
- Karthik Anantha Prakash
- Sparshith Nairbalige Rai
Topic: Understand and assess the factors that influenced the “Trump effect”. Probable questions we are trying to answer are:
- Which religious/ethnic group supports Trump the most?
- City-wise percentage of homeless people who voted for Trump (visualization).
- What percentage of people were Facebook users but not WhatsApp users and still supported Trump?
- What percentage of people voted for Trump's 'Lower taxes' propaganda and were in the 24-36 age group?
- How many people from the LGBTQ community with a salary in the range of 2400-3200 supported him, and what were their main voting issues?
- Anand, Deepika
- Kochhar, Sakshi
- Ramaswamy, Srividya
- Thuravi Prakash, Roopa
Topic: Social media can be used as a tool to gain emerging situational awareness during disasters. Understanding the impact of a disaster supports disaster response and decision making, and may help further disaster prediction and study. In this work, we propose to use information shared on Twitter (tweets), integrated with spatial data (location) and disaster information crawled from multiple sources (FEMA, USGS, Data.gov, etc.), to analyze and understand the impact of specific disasters. The study will include analysis of, and answers to, the following points:
- Retrieving the trending information about the disasters over time (temporal aspect)
- The effect of the disasters (spatial aspect)
- Summary of tweets relating to the disasters by displaying keywords
- Sentiment analysis relating to the disasters (positive / negative feeling)
- Lu, Guannan
- Nguyen, Minh, Ngoc Binh
- Zhang, Zixuan
- Chen, Dingyuan
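The keyword summary mentioned in the disaster topic above could be prototyped as a simple frequency count. This is a minimal sketch under stated assumptions: the stop-word list, the function name `top_keywords`, and the sample tweets are hypothetical and not part of the proposal.

```python
from collections import Counter
import re

# Hypothetical stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "to", "and", "on", "at", "for"}

def top_keywords(tweets, n=3):
    """Return the n most frequent non-stop-word tokens across the tweets."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep alphabetic tokens only (drops digits and punctuation).
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Made-up example tweets, for illustration only.
sample_tweets = [
    "Flood warning in the city, roads closed",
    "Roads closed after the flood",
    "Flood shelters open for the night",
]
print(top_keywords(sample_tweets, n=1))
```

Grouping the same counts by day or by location would give a rough view of the temporal and spatial aspects listed above.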
Topic: Anime Business Analysis Tool
Goal: We will scrape multiple anime websites (https://myanimelist.net/, http://www.anime-planet.com/) to create our own website, which performs business analysis. If you are about to start a new anime, you need full information about who your target audience is, whom you should pair with, who your competitors are, etc. There is a lot of data across the two sites, but it is not structured to answer these business questions. Probable questions we are trying to answer are:
- Top 10 regions that watch anime the most.
- What is the most liked anime for a given location?
- Most adored tagged character for a given age rating.
- Highest-scored anime for a particular broadcast time.
- Top 5 deviations between user reviews and score for an anime across different sites.
- Finding the top staff (producer, director, studio) based on score, so that you know whom to pair with.
- Kaur, Gurleen
- Lohith, Sachin, Keshavaiah
- Parekh, Disha Yogesh
- Junaid, Mohammed
Topic: Perform analysis on a collection of datasets and create an RDF ontology for countries around the world
We will be collecting datasets about countries around the world and creating an RDF ontology to show insightful information from the data. We aim to answer questions such as:
- What effect does GDP have on a country's education?
- What effect does population have on a country’s GDP?
- Where is terrorism most dominant?
- Which country has the lowest per capita income?
- Which countries have low/high levels of education?
- Which country has lowest per capita income and what are their overall ranges of income?
Our tentative list of topics for pooling into our data repository will primarily come from these heterogeneous areas:
World Population, Education, Income, Budget, Literacy, Crime, Drugs/Health, Trading, Military, Natural Resources, Infrastructure, Currency, Wildlife, Weather
- Jason Christopher Tan
- Georgios Lydakis
- Chetan Yadav
- Manas Mahanta
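For the country ontology described above, a question such as "Which country has the lowest per capita income?" reduces to a lookup over subject-predicate-object facts. The sketch below mirrors the RDF triple model with plain Python tuples; the `ex:` names and all numbers are placeholders, not real statistics, and the helper names are hypothetical.

```python
# Each fact is a (subject, predicate, object) triple, mirroring the RDF data model.
# Country names and numbers are placeholders, not real statistics.
TRIPLES = [
    ("ex:CountryA", "ex:perCapitaIncome", 1200),
    ("ex:CountryB", "ex:perCapitaIncome", 45000),
    ("ex:CountryC", "ex:perCapitaIncome", 800),
    ("ex:CountryA", "ex:literacyRate", 0.71),
    ("ex:CountryC", "ex:literacyRate", 0.64),
]

def objects_by_subject(triples, predicate):
    """Map each subject to its object value for the given predicate."""
    return {s: o for s, p, o in triples if p == predicate}

def lowest(triples, predicate):
    """Return the subject with the smallest value for the given predicate."""
    values = objects_by_subject(triples, predicate)
    return min(values, key=values.get)

print(lowest(TRIPLES, "ex:perCapitaIncome"))
```

In an actual RDF store the same question would normally be written as a query over the ontology; the tuple list only illustrates the shape of the data.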
Sentiment Analysis using Travel Ontology and Social Media Data
Goal: To design and create a travel ontology and perform sentiment analysis using datasets from Twitter, Instagram, Expedia, hotelbookings.com, and Yelp. This will help evaluate popular reviews for any airline, hotel, restaurant, or travel destination/landmark.
Results: Sample results could look like the following:
- Query on best landmarks leveraging the sentiments (Twitter feeds)
- Restaurants to visit at a location based on the sentiment of the users (travelers)
- Best/cheap airlines associated with the locations
- Best time to visit a destination based on people's feedback
- Atharva kale
- Ganmani sekar
- Aman Mathur
- Puneet Koul
Topic: Build a comprehensive source of information about video games that puts together in one place all the relevant pieces of data that a gamer needs, from the most popular sources about games (such as GameSpot, IGN, and Metacritic), in addition to related material such as books and movies (from sources such as IMDB and Goodreads) that may extend the experience of the gamer beyond gameplay.
Example of Relevant Data About a Game:
Game Title, Game Description, Game Publisher, Release Date, Platforms Available, GameSpot Review Score, IGN Review Score, GameSpot Review Detail, IGN Review Detail, Users Score in GameSpot, Users Score in IGN, Game Wikis/Guides GameSpot, Game Wikis/Guides IGN, etc.
Possible sample questions that can be answered:
- Other Games that might be Related to a specific Game.
- More Games by a certain Publisher.
- All Games that have related Movies (Or vice versa)
- All Games that have related Books (Or vice versa)
- Books that might be Related to a certain Game.
- Andrade, Juan, Francisco
- Muvva, Upendra Sandeep
- Naik, Chinmay
- Yesmin Kumar, Mehul
Topic: Recommend YouTube videos based on personality types. We will be using various video preference datasets and a personality dataset to answer the following questions:
Use Case 1:
- Identify personality trait of new users via Twitter and Facebook API
- Use the MBTI dataset and our own classifications from step 1
- Suggest movies to the user according to personality type (at this stage all videos are already mapped to a personality type)
Use Case 2:
- Pull new videos from YouTube
- Classify their personality type
- Recommend those videos to people with the outputted personality type
- YouTube Data API
Given a user, how likely are they to like a particular video?
Given a video, who is its target audience?
- Bino Joseph
- Nikita Gupta
- Malvika Nagpal
Topic: The key idea of our project is to integrate music data, including releases, ratings, artists, events, albums, etc., from several heterogeneous sources such as Spotify, MusicBrainz, SoundCloud, iTunes, etc. This will help provide a unified view of the music industry. The resulting ontology can then be used to answer interesting questions like:
- Which songs have good ratings across different web sites?
- Which are the most popular songs/albums for a particular artist?
- Which songs/albums have won Grammy awards?
- Are there any concerts/events in the vicinity for a specific artist?
- Which genre has the highest number of subscribers/listeners?
- Borikar, Shubham, Sudhir
- Jain, Eshaan
- Kheberi, Sarabjit Singh, Harwansh Singh
- Aggarwal, Madhur
Topic: Construct a system that provides the complete specifications of a cellphone and answers queries about it. The system will be able to answer queries about:
- Cell phone manufacturer
- Technical specifications
- User reviews and rating
- Details about production plants
- (May implement) data analysis on specific phone brands
- Adarsh Rajanikanth
- Malatesha Somasundar Anantha
- Neelima Vangipuram
- Sahil Wadhwa
Topic: The idea is to create a federated, ontology-based tool covering films adapted from novels, which can serve as a knowledge base for users. Various features are to be analyzed, and an existing ontology for each feature is to be used to build the framework.
- The movie dataset will be extracted from DBpedia, and the book dataset will be extracted from the Book Crossing Dataset from IIF.
- The built model will return the books related to the movie given in the query, and vice versa. Example: given the movie name "Notebook", it will extract the book with the book name "Notebook".
- Books searched by author name resulting in sorted dataset by rating for the given author name.
- Most popular books read by the given age group.
- Nisha Kapoor
- Yash Rajkumar Kedia
- Shruti Priya
- Nitisha Pandey
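The movie-to-book lookup in the topic above ("Notebook" the movie to "Notebook" the book) depends on titles from the two datasets lining up. One possible way to sketch that step is a normalized-title join; the normalization rules and the function names `normalize_title` and `match_movies_to_books` are assumptions for illustration, and the titles are placeholders rather than rows from DBpedia or Book Crossing.

```python
import re

def normalize_title(title):
    """Lowercase, drop a leading article, and strip punctuation so titles compare loosely."""
    t = title.lower().strip()
    t = re.sub(r"^(the|a|an)\s+", "", t)   # "The Notebook" -> "notebook"
    t = re.sub(r"[^a-z0-9 ]", "", t)       # keep ASCII letters, digits, spaces only
    return re.sub(r"\s+", " ", t).strip()

def match_movies_to_books(movie_titles, book_titles):
    """Map each movie title to the book titles that share its normalized form."""
    books = {}
    for b in book_titles:
        books.setdefault(normalize_title(b), []).append(b)
    return {m: books.get(normalize_title(m), []) for m in movie_titles}

# Placeholder titles for illustration.
movies = ["Notebook"]
books = ["The Notebook", "An Unrelated Title"]
print(match_movies_to_books(movies, books))
```

Exact matching on normalized titles will miss adaptations released under a different name, so a real system would likely need an explicit "adapted from" relation as well.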
Topic: Creating a correlation ontology:
Questions such as:
- Rahul Chandhoke
- Viswambharan Kasturi Rangarajan
- Shruti Anand Kulkarni
- Riya Bharat Punjab
Topic: The main idea of our project is to combine 3 datasets related to food, namely recipes, food images, and the nutrition value of ingredients, from Kaggle.
The project captures vital information about the selected recipes, such as:
- Aditya Parameshwara
- Charita Venugopal Etta
- Ganesh Madhav Raghupathy
- Xiaoyang Zhang
Topic: We plan to work with food data using the following datasets:
- Ahuja, Swapnil
- Chandurkar, Rushi, Nitin
- Mittal, Saksham
- Porwal, Raj
Topic: We want to create an ontology system that integrates data obtained from databases (e.g. museum websites) or from crawling websites (e.g. Wikipedia pages on paintings) that contain information about art, artists, and museums. With this system, we can create an engine that allows users to query terms such as artist, time period, or country; the engine could return museums that have the artist's paintings on display, or return related paintings from a time period with detailed descriptions. Our purpose in creating this ontology is to give people who are interested in art a medium through which they can look up information about a given artwork, where it is displayed, and other relevant information.
- YuLong Pei
- Jingjing Wang
- Majid Ghasemi Gol
- Dan Ma
Topic: Work with the movie datasets mentioned below to visualize and answer queries like:
- Which movie has the best rating overall and in each category?
- Which movies have won the most awards?
- Which genre has the highest rating?
- Which genre has won the most awards?
- Abhilash Natraj
- Rajni Kumari
- Shruthi Kalkunte Narayanaswamy
Topic: The basic idea for the project is that we will collect data from popular language package management systems (npm for Node.js, NuGet for .NET, gem for Ruby, etc.), then build an ontology-based knowledge base on package metadata, including the package name, package usage (the problems it solves), package dependencies, and author(s), to discover relationships between different packages, even across different languages. With information from online source code hosting services like GitHub, we can track the life cycles of these packages to improve our knowledge base.
- Dizheng Wang
- Huiqing Dai
- Sijie Chen
- Tianlei Xu
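Package dependencies, as described in the package-manager topic above, form a directed graph, and one relationship such a knowledge base could expose is the full set of packages a given package relies on. The following is a minimal sketch of that traversal; the package names and the `DEPENDS_ON` mapping are hypothetical and not taken from npm, NuGet, or RubyGems.

```python
def transitive_dependencies(graph, package):
    """Return every package reachable from `package` via dependency edges."""
    seen = set()
    stack = list(graph.get(package, ()))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited; also guards against dependency cycles
        seen.add(dep)
        stack.extend(graph.get(dep, ()))
    return seen

# Hypothetical package names and edges, for illustration only.
DEPENDS_ON = {
    "web-app": ["http-client", "logger"],
    "http-client": ["url-parser", "logger"],
    "url-parser": [],
    "logger": [],
}
print(sorted(transitive_dependencies(DEPENDS_ON, "web-app")))
```

Reversing the edges gives the opposite view: which packages would be affected if a given package changed.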
Topic: Understanding the impact of global population, GDP, health, freedom, and pollution on world migration:
- Impact on immigration of each of the following, independently or together: GDP, per capita income, government trust, population, and happiness index.
- Given the population, what is the trust indicator that the people of that country have in their government?
- Impact of population on the pollution of countries
- Rasvitha Kandur
- Manikanta Kotthapalli
- Karthik Chindalur Sridhara
- Karthik Ravindra Rao