Weakly-supervised Deep Embedding for Product Review Sentiment Analysis

ABSTRACT:

Product reviews help prospective buyers make decisions. To this end, different opinion mining techniques have been proposed, where judging a review sentence’s orientation (e.g., positive or negative) is one of the key challenges. Recently, deep learning has emerged as an effective means of solving sentiment classification problems. A neural network learns a useful representation automatically, without human effort. However, the success of deep learning relies heavily on the availability of large-scale training data. We propose a novel deep learning framework for product review sentiment classification that employs widely available ratings as weak supervision signals. The framework consists of two steps: (1) learning a high-level representation (an embedding space) that captures the general sentiment distribution of sentences through rating information; (2) adding a classification layer on top of the embedding layer and using labeled sentences for supervised fine-tuning. We explore two kinds of low-level network structure for modeling review sentences, namely convolutional feature extractors and long short-term memory. To evaluate the proposed framework, we construct a dataset from Amazon containing 1.1M weakly labeled review sentences and 11,754 labeled review sentences. Experimental results show the efficacy of the proposed framework and its superiority over baselines.
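
As a rough illustration of how ratings can act as weak supervision, the sketch below gives every sentence of a review the label implied by the review's star rating. The 1-5 scale, the thresholds, the naive sentence splitting, and the names `weak_label` and `weakly_label_sentences` are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Optional, Tuple


def weak_label(rating: int) -> Optional[int]:
    """Map a star rating to a weak sentiment label.

    Assumed thresholds (hypothetical): 4-5 stars -> positive (1),
    1-2 stars -> negative (0), 3 stars -> None (too ambiguous to use).
    """
    if rating >= 4:
        return 1
    if rating <= 2:
        return 0
    return None


def weakly_label_sentences(reviews: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Give each sentence the weak label implied by its review's rating."""
    labeled = []
    for text, rating in reviews:
        label = weak_label(rating)
        if label is None:
            continue  # skip reviews whose rating carries no clear signal
        for sentence in text.split("."):  # naive splitting, for illustration only
            sentence = sentence.strip()
            if sentence:
                labeled.append((sentence, label))
    return labeled


reviews = [
    ("Great battery. Screen is a bit dim.", 5),
    ("Stopped working after a week.", 1),
    ("It is okay.", 3),
]
weak_data = weakly_label_sentences(reviews)
```

In this toy input, "Screen is a bit dim" inherits a positive label from a 5-star review. That is the kind of label noise that makes rating-derived supervision weak rather than exact.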

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Wei Zhao, Ziyu Guan, Long Chen, Xiaofei He, Fellow, IAPR, Deng Cai, Beidou Wang, and Quan Wang, “Weakly-supervised Deep Embedding for Product Review Sentiment Analysis”, IEEE Transactions on Knowledge and Data Engineering, 2018.

CompetitiveBike: Competitive Analysis and Popularity Prediction of Bike-Sharing Apps Using Multi-Source Data

ABSTRACT:

In recent years, bike-sharing systems have been widely deployed in many big cities, offering an economical and healthy lifestyle. With the prevalence of bike-sharing systems, many companies have joined the bike-sharing market, leading to increasingly fierce competition. To be competitive, bike-sharing companies and app developers need to make strategic decisions and predict the popularity of bike-sharing apps. However, existing works mostly focus on predicting the popularity of a single app; the popularity contest among different apps has not been explored yet. In this paper, we aim to forecast the popularity contest between Mobike and Ofo, the two most popular bike-sharing apps in China. We develop CompetitiveBike, a system that predicts the popularity contest among bike-sharing apps by leveraging multi-source data. We extract two novel types of features, coarse-grained and fine-grained competitive features, and use a Random Forest model to forecast future competitiveness. In addition, we view mobile app competition as a long-term event and generate the event storyline to enrich our competitive analysis. We collect data about two bike-sharing apps and two food ordering and delivery apps from 11 app stores and Sina Weibo and conduct extensive experimental studies; the results demonstrate the effectiveness and generality of our approach.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Yi Ouyang, Bin Guo, Senior Member, IEEE, Xinjiang Lu, Qi Han, Member, IEEE, Tong Guo, and Zhiwen Yu, Senior Member, IEEE, “CompetitiveBike: Competitive Analysis and Popularity Prediction of Bike-Sharing Apps Using Multi-Source Data”, IEEE Transactions on Mobile Computing, 2018.

Scalable Content-Aware Collaborative Filtering for Location Recommendation

ABSTRACT:

Location recommendation plays an essential role in helping people find attractive places. Though recent research has studied how to recommend locations using social and geographical information, little of it has addressed the cold-start problem of new users. Because mobility records are often shared on social networks, semantic information can be leveraged to tackle this challenge. A typical method is to feed this information into explicit-feedback-based content-aware collaborative filtering, but such methods require drawing negative samples for better learning performance, as users’ negative preferences are not observable in human mobility. However, prior studies have empirically shown that sampling-based methods do not perform well. To this end, we propose a scalable Implicit-feedback-based Content-aware Collaborative Filtering (ICCF) framework that incorporates semantic content and steers clear of negative sampling. We then develop an efficient optimization algorithm that scales linearly with data size and feature size, and quadratically with the dimension of the latent space. We further establish its relationship with graph Laplacian regularized matrix factorization. Finally, we evaluate ICCF on a large-scale LBSN dataset in which users have profiles and textual content. The results show that ICCF outperforms several competing baselines, and that user information is effective not only for improving recommendations but also for coping with cold-start scenarios.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Defu Lian, Yong Ge, Fuzheng Zhang, Nicholas Jing Yuan, Xing Xie, Tao Zhou, and Yong Rui, “Scalable Content-Aware Collaborative Filtering for Location Recommendation”, IEEE Transactions on Knowledge and Data Engineering, 2018.

Privacy-Preserving Social Media Data Publishing for Personalized Ranking-Based Recommendation

ABSTRACT:

Personalized recommendation is crucial for helping users find pertinent information. It often relies on a large collection of user data, in particular users’ online activity (e.g., tagging/rating/checking-in) on social media, to mine user preferences. However, releasing such user activity data makes users vulnerable to inference attacks, as private data (e.g., gender) can often be inferred from the users’ activity data. In this paper, we propose PrivRank, a customizable and continuous privacy-preserving social media data publishing framework that protects users against inference attacks while enabling personalized ranking-based recommendations. Its key idea is to continuously obfuscate user activity data such that the privacy leakage of user-specified private data is minimized under a given data distortion budget, which bounds the ranking loss incurred by the data obfuscation process in order to preserve the utility of the data for enabling recommendations. An empirical evaluation on both synthetic and real-world datasets shows that our framework can efficiently provide effective and continuous protection of user-specified private data, while still preserving the utility of the obfuscated data for personalized ranking-based recommendation. Compared to state-of-the-art approaches, PrivRank achieves both better privacy protection and higher utility in all the ranking-based recommendation use cases we tested.
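
To make the idea of a distortion budget that bounds ranking loss more concrete, the sketch below measures how much an obfuscated activity profile reorders items relative to the original and checks the result against a budget. Using the normalized Kendall tau distance as the ranking loss is an assumption made for illustration, and the names `kendall_tau_distance` and `within_budget` are hypothetical; the abstract does not say which loss PrivRank uses.

```python
from itertools import combinations
from typing import Dict


def kendall_tau_distance(original: Dict[str, float], obfuscated: Dict[str, float]) -> float:
    """Fraction of item pairs whose relative order flips between two score maps.

    Pairs that are tied in either map are not counted as flipped.
    """
    pairs = list(combinations(sorted(original), 2))
    if not pairs:
        return 0.0
    flipped = 0
    for a, b in pairs:
        before = original[a] - original[b]
        after = obfuscated[a] - obfuscated[b]
        if before * after < 0:  # the sign changed, so the order flipped
            flipped += 1
    return flipped / len(pairs)


def within_budget(original: Dict[str, float], obfuscated: Dict[str, float], budget: float) -> bool:
    """Check that the ranking loss stays inside the distortion budget."""
    return kendall_tau_distance(original, obfuscated) <= budget


# Toy activity counts per item for one user (made-up values).
original = {"cafe": 5, "gym": 3, "museum": 1}
obfuscated = {"cafe": 4, "gym": 2, "museum": 3}
loss = kendall_tau_distance(original, obfuscated)
```

Here only the gym/museum pair changes order, so one of three pairs is flipped. An obfuscation step would be accepted under a budget of 0.5 and rejected under a budget of 0.2.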

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Dingqi Yang, Bingqing Qu, and Philippe Cudré-Mauroux, “Privacy-Preserving Social Media Data Publishing for Personalized Ranking-Based Recommendation”, IEEE Transactions on Knowledge and Data Engineering, 2018.

Frequent Itemsets Mining With Differential Privacy Over Large-Scale Data

ABSTRACT:

Frequent itemsets mining with differential privacy refers to the problem of mining all frequent itemsets whose supports are above a given threshold in a given transactional dataset, with the constraint that the mined results should not break the privacy of any single transaction. Current solutions to this problem cannot balance efficiency, privacy, and data utility well over large-scale data. To this end, we propose an efficient, differentially private frequent itemsets mining algorithm for large-scale data. Based on the ideas of sampling and transaction truncation using length constraints, our algorithm reduces the computation intensity, reduces mining sensitivity, and thus improves data utility given a fixed privacy budget. Experimental results show that our algorithm achieves better performance than prior approaches on multiple datasets.
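
The link between truncation and sensitivity can be sketched for the simplest case, single-item supports. If every transaction is cut to at most `max_len` items, adding or removing one transaction changes at most `max_len` counts by one each, so Laplace noise with scale `max_len / epsilon` suffices for those counts. This is a minimal sketch under stated assumptions, not the paper's algorithm: it omits sampling, handles only 1-itemsets, truncates by keeping the first items, and releases noisy counts only for observed items (a full mechanism would cover the whole item universe). All function names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List


def truncate(transactions: List[List[str]], max_len: int) -> List[List[str]]:
    """Keep at most max_len items per transaction (here simply the first ones)."""
    return [t[:max_len] for t in transactions]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_item_supports(
    transactions: List[List[str]], max_len: int, epsilon: float, seed: int = 0
) -> Dict[str, float]:
    """Return noisy support counts of single items after truncation."""
    rng = random.Random(seed)
    truncated = truncate(transactions, max_len)
    counts = Counter(item for t in truncated for item in set(t))
    # One transaction touches at most max_len counts, so the L1 sensitivity
    # of the count vector is max_len and the Laplace scale is max_len / epsilon.
    scale = max_len / epsilon
    return {item: counts[item] + laplace_noise(scale, rng) for item in sorted(counts)}


transactions = [["milk", "bread", "eggs"], ["milk", "bread"], ["milk"]]
supports = noisy_item_supports(transactions, max_len=2, epsilon=1.0)
```

A smaller `max_len` means less noise per count but discards more items from long transactions, which is the utility trade-off the abstract alludes to.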

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Xinyu Xiong, Fei Chen, Peizhi Huang, Miaomiao Tian, Xiaofang Hu, Badong Chen, and Jing Qin, “Frequent Itemsets Mining With Differential Privacy Over Large-Scale Data”, IEEE Access, 2018.

Characterizing and Predicting Early Reviewers for Effective Product Marketing on E-Commerce Websites

ABSTRACT:

Online reviews have become an important source of information for users making an informed purchase decision. Early reviews of a product tend to have a high impact on subsequent product sales. In this paper, we take the initiative to study the behavior characteristics of early reviewers through the reviews they posted on two large real-world e-commerce platforms, i.e., Amazon and Yelp. Specifically, we divide the product lifetime into three consecutive stages, namely early, majority, and laggards. A user who has posted a review in the early stage is considered an early reviewer. We quantitatively characterize early reviewers based on their rating behaviors, the helpfulness scores received from others, and the correlation of their reviews with product popularity. We have found that (1) an early reviewer tends to assign a higher average rating score; and (2) an early reviewer tends to post more helpful reviews. Our analysis of product reviews also indicates that early reviewers’ ratings and their received helpfulness scores are likely to influence product popularity. By viewing the review posting process as a multiplayer competition game, we propose a novel margin-based embedding model for early reviewer prediction. Extensive experiments on two different e-commerce datasets show that our proposed approach outperforms a number of competitive baselines.
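
The three-stage split described above can be sketched as follows. The cut-offs (first 20% of a product's reviews as early, last 20% as laggards), the split by chronological position rather than elapsed time, and the assumption of one review per user per product are all hypothetical choices for illustration; the abstract does not give the actual boundaries.

```python
from typing import Dict, List, Tuple


def assign_stages(
    reviews: List[Tuple[str, int]],
    early_frac: float = 0.2,
    laggard_frac: float = 0.2,
) -> Dict[str, str]:
    """Label each reviewer of one product as early, majority, or laggards.

    reviews holds (user, timestamp) pairs; the stage is decided by the
    review's position in chronological order (assumed cut-offs).
    """
    ordered = sorted(reviews, key=lambda r: r[1])
    n = len(ordered)
    early_end = int(n * early_frac)
    laggard_start = n - int(n * laggard_frac)
    stages = {}
    for position, (user, _) in enumerate(ordered):
        if position < early_end:
            stages[user] = "early"
        elif position >= laggard_start:
            stages[user] = "laggards"
        else:
            stages[user] = "majority"
    return stages


# Ten reviews for one product, deliberately given out of order.
reviews = [(f"u{i}", i) for i in reversed(range(10))]
stages = assign_stages(reviews)
early_reviewers = sorted(u for u, s in stages.items() if s == "early")
```

With ten reviews and these cut-offs, the two oldest reviews come from early reviewers, the next six from the majority, and the last two from laggards.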

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Ting Bai, Wayne Xin Zhao, Member, IEEE, Yulan He, Member, IEEE, Jian-Yun Nie, Member, IEEE, and Ji-Rong Wen, Member, IEEE, “Characterizing and Predicting Early Reviewers for Effective Product Marketing on E-Commerce Websites”, IEEE Transactions on Knowledge and Data Engineering, 2018.

Achieving Data Truthfulness and Privacy Preservation in Data Markets

ABSTRACT:

As a significant business paradigm, many online information platforms have emerged to satisfy society’s needs for person-specific data, where a service provider collects raw data from data contributors and then offers value-added data services to data consumers. However, in the data trading layer, the data consumers face a pressing problem: how to verify whether the service provider has truthfully collected and processed data. Furthermore, the data contributors are usually unwilling to reveal their sensitive personal data and real identities to the data consumers. In this paper, we propose TPDM, which efficiently integrates Truthfulness and Privacy preservation in Data Markets. TPDM is structured internally in an Encrypt-then-Sign fashion, using partially homomorphic encryption and identity-based signature. It simultaneously facilitates batch verification, data processing, and outcome verification, while maintaining identity preservation and data confidentiality. We also instantiate TPDM with a profile matching service and a data distribution service, and extensively evaluate their performance on the Yahoo! Music ratings dataset and the 2009 RECS dataset, respectively. Our analysis and evaluation results reveal that TPDM achieves several desirable properties while incurring low computation and communication overheads when supporting large-scale data markets.
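
To show what partially homomorphic encryption lets a service provider do, the sketch below implements a toy additively homomorphic scheme in the style of Paillier, where multiplying two ciphertexts yields an encryption of the sum of the plaintexts. Choosing Paillier is an assumption; the abstract does not name the scheme TPDM uses. The primes are tiny and the code is not secure, and the signing half of Encrypt-then-Sign is omitted entirely. All function names are hypothetical.

```python
import math
import random
from typing import Tuple

PublicKey = Tuple[int, int]   # (n, g)
PrivateKey = Tuple[int, int]  # (lam, mu)


def keygen(p: int, q: int) -> Tuple[PublicKey, PrivateKey]:
    """Build a toy key pair from two small primes (NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    mu = pow(lam, -1, n)  # modular inverse; valid for g = n + 1 (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub: PublicKey, m: int, rng: random.Random) -> int:
    """Encrypt an integer m with 0 <= m < n."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub: PublicKey, priv: PrivateKey, c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub: PublicKey, c1: int, c2: int) -> int:
    """Combine two ciphertexts into an encryption of the plaintext sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen(17, 19)
rng = random.Random(0)
c1 = encrypt(pub, 20, rng)
c2 = encrypt(pub, 22, rng)
total = decrypt(pub, priv, add_encrypted(pub, c1, c2))  # 20 + 22 = 42
```

The provider only ever handles `c1` and `c2`, yet the key holder can decrypt the aggregated result. The sum must stay below `n` (323 here) to decrypt correctly.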

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Chaoyue Niu, Student Member, IEEE, Zhenzhe Zheng, Student Member, IEEE, Fan Wu, Member, IEEE, Xiaofeng Gao, Member, IEEE, and Guihai Chen, Senior Member, IEEE, “Achieving Data Truthfulness and Privacy Preservation in Data Markets”, IEEE Transactions on Knowledge and Data Engineering, 2018.

A General Framework for Implicit and Explicit Social Recommendation

ABSTRACT:

Research on social recommendation aims at exploiting social information to improve the quality of a recommender system. It can be divided into two classes. Explicit social recommendation assumes the existence of not only the users’ ratings on items, but also the explicit social connections between users. Implicit social recommendation assumes the availability of only the ratings but not the social connections between users, and attempts to infer implicit social connections between users with the goal of boosting recommendation accuracy. This paper proposes a unified framework that is applicable to both explicit and implicit social recommendation. We propose an optimization framework that learns the degree of social correlation and rating prediction jointly, so that the two tasks can boost each other's performance. Furthermore, a well-known challenge for implicit social recommendation is that it takes quadratic time to learn the strength of pairwise connections. This paper further proposes several practical tricks to reduce the complexity of our model to be linear in the number of observed ratings. The experiments show that the proposed model, with only two parameters, can significantly outperform the state-of-the-art solutions for both explicit and implicit social recommender systems.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Chin-Chi Hsu, Mi-Yen Yeh, Shou-De Lin, “A General Framework for Implicit and Explicit Social Recommendation”, IEEE Transactions on Knowledge and Data Engineering, 2018.

A Comprehensive Study on Social Network Mental Disorders Detection via Online Social Media Mining

ABSTRACT:

The explosive growth in the popularity of social networking has led to problematic usage. An increasing number of social network mental disorders (SNMDs), such as Cyber-Relationship Addiction, Information Overload, and Net Compulsion, have been recently noted. Symptoms of these mental disorders are usually observed passively today, resulting in delayed clinical intervention. In this paper, we argue that mining online social behavior provides an opportunity to actively identify SNMDs at an early stage. It is challenging to detect SNMDs because the mental status cannot be directly observed from online social activity logs. Our approach, new and innovative to the practice of SNMD detection, does not rely on self-reporting of those mental factors via psychological questionnaires. Instead, we propose a machine learning framework, namely Social Network Mental Disorder Detection (SNMDD), that exploits features extracted from social network data to accurately identify potential cases of SNMDs. We also exploit multi-source learning in SNMDD and propose a new SNMD-based Tensor Model (STM) to improve the accuracy. To increase the scalability of STM, we further improve its efficiency with a performance guarantee. Our framework is evaluated via a user study with 3126 online social network users. We conduct a feature analysis, apply SNMDD to large-scale datasets, and analyze the characteristics of the three SNMD types. The results show that SNMDD is promising for identifying online social network users with potential SNMDs.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Hong-Han Shuai, Chih-Ya Shen, De-Nian Yang, Senior Member, IEEE, Yi-Feng Lan, Wang-Chien Lee, Philip S. Yu, Fellow, IEEE and Ming-Syan Chen, Fellow, IEEE, “A Comprehensive Study on Social Network Mental Disorders Detection via Online Social Media Mining”, IEEE Transactions on Knowledge and Data Engineering, 2018.