Spammer Detection and Fake User Identification on Social Networks

ABSTRACT:

Social networking sites engage millions of users around the world. The users’ interactions with these social sites, such as Twitter and Facebook, have a tremendous impact and occasionally undesirable repercussions for daily life. The prominent social networking sites have turned into a target platform for spammers to disperse a huge amount of irrelevant and deleterious information. Twitter, for example, has become one of the most heavily used platforms of all time and therefore allows an unreasonable amount of spam. Fake users send undesired tweets to users to promote services or websites, which not only affects legitimate users but also disrupts resource consumption. Moreover, the possibility of spreading invalid information to users through fake identities has increased, which results in the dissemination of harmful content. Recently, the detection of spammers and identification of fake users on Twitter has become a common area of research in contemporary online social networks (OSNs). In this paper, we perform a review of techniques used for detecting spammers on Twitter. Moreover, a taxonomy of the Twitter spam detection approaches is presented that classifies the techniques based on their ability to detect: (i) fake content, (ii) spam based on URL, (iii) spam in trending topics, and (iv) fake users. The presented techniques are also compared based on various features, such as user features, content features, graph features, structure features, and time features. We are hopeful that the presented study will be a useful resource for researchers to find the highlights of recent developments in Twitter spam detection on a single platform.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Faiza Masood, Ghana Ammad, Ahmad Almogren (Senior Member, IEEE), Assad Abbas, Hasan Ali Khattak (Senior Member, IEEE), Ikram Ud Din (Senior Member, IEEE), Mohsen Guizani (Fellow, IEEE), and Mansour Zuair, “Spammer Detection and Fake User Identification on Social Networks”, IEEE Access, 2019.

Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data

ABSTRACT:

Cloud computing provides individuals and enterprises with massive computing power and scalable storage capacities to support a variety of big data applications in domains like health care and scientific research; therefore, more and more data owners choose to outsource their data to cloud servers for great convenience in data management and mining. However, data sets like health records in electronic documents usually contain sensitive information, which raises privacy concerns if the documents are released to or shared with partially untrusted third parties in the cloud. A practical and widely used technique for data privacy preservation is to encrypt data before outsourcing it to the cloud servers, which, however, reduces the data utility and makes many traditional data analytic operators, like keyword-based top-k document retrieval, obsolete. In this paper, we investigate the multi-keyword top-k search problem for big data encryption against privacy breaches, and attempt to identify an efficient and secure solution to this problem. Specifically, for the privacy concern of query data, we construct a special tree-based index structure and design a random traversal algorithm, which makes even the same query produce different visiting paths on the index, and can also keep the accuracy of queries unchanged under stronger privacy. To improve query efficiency, we propose a group multi-keyword top-k search scheme based on the idea of partition, where a group of tree-based indexes is constructed for all documents. Finally, we combine these methods into an efficient and secure approach to address our proposed top-k similarity search. Extensive experimental results on real-life data sets demonstrate that our proposed approach can significantly improve the capability of defending against privacy breaches, the scalability, and the time efficiency of query processing over the state-of-the-art methods.
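
As a point of reference for the retrieval operator that encryption makes hard, the following is a minimal plaintext sketch of multi-keyword top-k similarity search. It is not the paper's scheme: there is no encryption, tree index, or random traversal here. The inner-product scoring, the function names (score, top_k), and the toy documents are all assumptions made for illustration.

```python
import heapq


def score(doc_vector, query_vector):
    """Inner-product similarity between a document's keyword weights and a query.

    Both arguments are dicts mapping keyword -> weight (an assumed representation).
    """
    return sum(weight * query_vector.get(keyword, 0.0)
               for keyword, weight in doc_vector.items())


def top_k(documents, query_vector, k):
    """Return the k (doc_id, score) pairs with the highest similarity, best first."""
    scored = ((doc_id, score(vector, query_vector))
              for doc_id, vector in documents.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical toy collection, for illustration only.
documents = {
    "d1": {"cloud": 0.5, "privacy": 0.5},
    "d2": {"health": 1.0},
    "d3": {"privacy": 1.0, "search": 0.5},
}
query = {"privacy": 1.0, "search": 1.0}
result = top_k(documents, query, 2)
print(result)
```

A privacy-preserving scheme has to return the same ranking while the server sees neither the keyword weights nor the query, which is the gap the abstract's index structure and traversal algorithm are meant to close.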

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Xiaofeng Ding, Member, IEEE, Peng Liu and Hai Jin, Senior Member, IEEE, “Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data”, IEEE Transactions on Dependable and Secure Computing, 2019.

Privacy-Preserving Attribute-Based Keyword Search in Shared Multi-owner Setting

ABSTRACT:

Ciphertext-Policy Attribute-Based Keyword Search (CP-ABKS) facilitates search queries and supports fine-grained access control over encrypted data in the cloud. However, prior CP-ABKS schemes were designed to support the unshared multi-owner setting and cannot be directly applied in the shared multi-owner setting (where each record is accredited by a fixed number of data owners) without incurring high computational and storage costs. In addition, due to privacy concerns on access policies, most existing schemes are vulnerable to off-line keyword-guessing attacks if the keyword space is of polynomial size. Furthermore, it is difficult to identify malicious users who leak the secret keys when more than one data user has the same subset of attributes. In this paper, we present a privacy-preserving CP-ABKS system with hidden access policy in the shared multi-owner setting (basic ABKS-SM system), and demonstrate how it is improved to support malicious user tracing (modified ABKS-SM system). We then prove that the proposed ABKS-SM systems achieve selective security and resist off-line keyword-guessing attacks in the generic bilinear group model. We also evaluate their performance using real-world datasets.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Yinbin Miao, Ximeng Liu, Kim-Kwang Raymond Choo, Senior Member, IEEE, Robert H. Deng, Fellow, IEEE, Jiguo Li, Hongwei Li, and Jianfeng Ma, “Privacy-Preserving Attribute-Based Keyword Search in Shared Multi-owner Setting”, IEEE Transactions on Dependable and Secure Computing, 2019.

Enabling Authorized Encrypted Search for Multi-Authority Medical Databases

ABSTRACT:

E-medical records are sensitive and should be stored in a medical database in encrypted form. However, simply encrypting these records will eliminate data utility and interoperability of the existing medical database system because encrypted records are no longer searchable. Moreover, multiple authorities could be involved in controlling and sharing the private medical records of clients. However, authorizing different clients to search and access records originating from multiple authorities in a secure and scalable manner is a nontrivial matter. To address the above issues, we propose an authorized searchable encryption scheme under a multi-authority setting. Specifically, our proposed scheme leverages the RSA function to enable each authority to limit the search capability of different clients based on clients’ privileges. To improve scalability, we utilize multi-authority attribute-based encryption to allow the authorization process to be performed only once even over policies from multiple authorities. We conduct rigorous security and cost analysis, and perform experimental evaluations to demonstrate that the proposed scheme introduces moderate overhead to existing searchable encryption schemes.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Lei Xu, Shifeng Sun, Xingliang Yuan, Joseph K. Liu, Cong Zuo, Chungen Xu, “Enabling Authorized Encrypted Search for Multi-Authority Medical Databases”, IEEE Transactions on Emerging Topics in Computing, 2019.

 

Authentication by Encrypted Negative Password

ABSTRACT:

Secure password storage is a vital aspect of systems based on password authentication, which is still the most widely used authentication technique despite some security flaws. In this paper, we propose a password authentication framework that is designed for secure password storage and could be easily integrated into existing authentication systems. In our framework, first, the plain password received from a client is hashed through a cryptographic hash function (e.g., SHA-256). Then, the hashed password is converted into a negative password. Finally, the negative password is encrypted into an Encrypted Negative Password (abbreviated as ENP) using a symmetric-key algorithm (e.g., AES), and multi-iteration encryption could be employed to further improve security. The cryptographic hash function and symmetric encryption make it difficult to crack passwords from ENPs. Moreover, there are many corresponding ENPs for a given plain password, which makes precomputation attacks (e.g., lookup table attack and rainbow table attack) infeasible. The algorithm complexity analyses and comparisons show that the ENP could resist lookup table attack and provide stronger password protection under dictionary attack. It is worth mentioning that the ENP does not introduce extra elements (e.g., salt); even so, the ENP could still resist precomputation attacks. Most importantly, the ENP is the first password protection scheme that combines the cryptographic hash function, the negative password, and the symmetric-key algorithm, without the need for additional information except the plain password.
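
The first two steps of the framework (hash, then convert to a negative representation) can be illustrated with a minimal sketch. The abstract does not spell out the conversion algorithm, so the prefix-style construction below is an assumption: it lists ternary patterns over 0, 1, and * that match every bit string except the hashed password. It is deterministic, so it does not reproduce the paper's property that one password maps to many ENPs, and the final AES step is omitted because the Python standard library has no AES implementation. All function names are hypothetical.

```python
import hashlib


def hash_bits(password):
    """SHA-256 digest of the password as a 256-character string of '0' and '1'."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return "".join(f"{byte:08b}" for byte in digest)


def negative_entries(bits):
    """Assumed prefix-style negative representation of a bit string.

    Entry i keeps the first i bits, flips bit i, and wildcards the rest,
    so every string except `bits` itself matches exactly one entry.
    """
    entries = []
    for i, bit in enumerate(bits):
        flipped = "1" if bit == "0" else "0"
        entries.append(bits[:i] + flipped + "*" * (len(bits) - i - 1))
    return entries


def matches(entry, candidate):
    """True if the candidate bit string fits the ternary pattern."""
    return all(e == "*" or e == c for e, c in zip(entry, candidate))


def verify(entries, candidate_bits):
    """A candidate is the hidden string exactly when no entry matches it."""
    return not any(matches(entry, candidate_bits) for entry in entries)


stored = negative_entries(hash_bits("correct horse"))
print(verify(stored, hash_bits("correct horse")))  # the right password
print(verify(stored, hash_bits("wrong guess")))    # a wrong password
```

In the framework described above, the stored value would be the encrypted form of such entries rather than the entries themselves.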

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Wenjian Luo, Senior Member, IEEE, Yamin Hu, Hao Jiang, and Junteng Wang, “Authentication by Encrypted Negative Password”, IEEE Transactions on Information Forensics and Security, Volume: 14, Issue: 1, Jan. 2019.

 

A Hybrid E-learning Recommendation Approach Based on Learners’ Influence Propagation

ABSTRACT:

In e-learning recommender systems, interpersonal information between learners is very scarce, which makes it difficult to apply collaborative filtering (CF) techniques. In this study, we propose a hybrid filtering (HF) recommendation approach (SI-IFL) that combines a learner influence model (LIM), a self-organization based (SOB) recommendation strategy, and sequential pattern mining (SPM) to recommend learning objects (LOs) to learners. The method works as follows: (i) LIM is applied to acquire the interpersonal information by computing the influence that a learner exerts on others. LIM consists of learner similarity, knowledge credibility, and learner aggregation. LIM is independent of ratings. Furthermore, to address the uncertain and fuzzy nature of learners, intuitionistic fuzzy logic (IFL) is applied to optimize the LIM. (ii) An SOB recommendation strategy is applied to recommend the optimal learner cliques for active learners by simulating the influence propagation among learners. Influence propagation means that a learner can move toward active learners, and such behaviors can stimulate the moving behaviors of his neighbors. This SOB recommendation approach achieves a stable structure based on distributed and bottom-up behaviors of individuals. (iii) SPM is applied to decide the final learning objects (LOs) and navigational paths based on the recommended learner cliques. The experimental results demonstrate that SI-IFL can provide personalized and diversified recommendations, and it shows promising efficiency and adaptability in e-learning scenarios.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Shanshan Wan, Zhendong Niu, “A Hybrid E-learning Recommendation Approach Based on Learners’ Influence Propagation”, IEEE Transactions on Knowledge and Data Engineering, 2019.

Mining Users Trust From E-Commerce Reviews Based on Sentiment Similarity Analysis

ABSTRACT:

Consumers’ reviews in E-commerce systems are usually treated as important resources that reflect users’ experience, feelings, and willingness to purchase items. All this information may involve consumers’ views on things that can express interest, sentiments, and opinions. Many studies have shown that people are more likely to trust each other when they share the same attitude toward similar things. In this paper, we consider that seeking and accepting sentiments and suggestions in E-commerce systems implies a form of trust between consumers during shopping. Following this point of view, a sentiment similarity analysis approach oriented to mining E-commerce system reviews is put forward to explore users’ similarity and their trust. We divide trust into two categories, namely direct trust and propagation of trust, each of which represents a trust relationship between two individuals. The direct trust degree is obtained from sentiment similarity, and we present an entity-sentiment word pair mining method for similarity feature extraction. The propagation of trust is calculated according to the transitivity feature. Using the proposed trust representation model, we use the shortest path to describe the tightness of trust and put forward an improved shortest path algorithm to figure out the propagation trust relationship between users. A large-scale E-commerce website reviews dataset is collected to examine the accuracy of the algorithms and the feasibility of the models. The experimental results indicate that sentiment similarity analysis can be an efficient method to find trust between users in E-commerce systems.
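
The idea of deriving propagated trust from direct trust along a shortest path can be sketched as follows. The abstract does not describe the improved shortest path algorithm, so this sketch substitutes plain Dijkstra and assumes that trust is transitive by multiplication, with direct trust values in (0, 1]. The function name propagated_trust and the example users and values are hypothetical.

```python
import heapq
import math


def propagated_trust(direct_trust, source, target):
    """Strongest multiplicative trust from source to target, or 0.0 if unreachable.

    direct_trust maps user -> {neighbor: trust in (0, 1]} (an assumed format).
    Maximizing a product of such values is the same as minimizing the sum of
    -log(trust), which is a non-negative edge cost suitable for Dijkstra.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, user = heapq.heappop(queue)
        if user == target:
            return math.exp(-cost)
        if cost > best.get(user, math.inf):
            continue  # stale queue entry
        for neighbor, trust in direct_trust.get(user, {}).items():
            if trust <= 0.0:
                continue  # no trust, no edge
            new_cost = cost - math.log(trust)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return 0.0


# Hypothetical direct trust degrees, e.g. obtained from sentiment similarity.
direct_trust = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"carol": 0.8},
}
trust_alice_carol = propagated_trust(direct_trust, "alice", "carol")
print(round(trust_alice_carol, 2))
```

Here the indirect route through bob (0.9 × 0.8) is stronger than the direct edge (0.5), so the propagated value exceeds the direct trust degree.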

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : PHP
  • Tool : WAMP
  • Database : MySQL

REFERENCE:

Shaozhong Zhang and Haidong Zhong, “Mining Users Trust From E-Commerce Reviews Based on Sentiment Similarity Analysis”, IEEE Access, 2019.

 

Crop Yield Prediction and Efficient use of Fertilizers

ABSTRACT:

India is an agricultural country whose economy predominantly depends on agricultural yield growth and agro-industry products. Data mining is an emerging research field in crop yield analysis. Yield prediction is a very important issue in agriculture. Every farmer is interested in knowing how much yield to expect. The system analyzes various related attributes such as location and pH value, from which the alkalinity of the soil is determined, along with the percentage of nutrients such as nitrogen (N), phosphorus (P), and potassium (K). Location is used together with third-party applications, such as APIs for weather and temperature, so that the type of soil, the nutrient value of the soil in that region, the amount of rainfall in the region, and the soil composition can be determined. All these attributes are analyzed, and the data is used to train various suitable machine learning algorithms to create a model. The system provides a model intended to predict crop yield precisely and accurately and to give the end user proper recommendations about the required fertilizer ratio based on the atmospheric and soil parameters of the land, which helps increase the crop yield and farmer revenue.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

S. Bhanumathi, M. Vineeth, and N. Rohit, “Crop Yield Prediction and Efficient use of Fertilizers”, IEEE International Conference on Communication and Signal Processing, April 4-6, 2019.

Spammer Detection and Fake User Identification on Social Networks

ABSTRACT:

Social networking sites engage millions of users around the world. The users’ interactions with these social sites, such as Twitter and Facebook, have a tremendous impact and occasionally undesirable repercussions for daily life. The prominent social networking sites have turned into a target platform for spammers to disperse a huge amount of irrelevant and deleterious information. Twitter, for example, has become one of the most heavily used platforms of all time and therefore allows an unreasonable amount of spam. Fake users send undesired tweets to users to promote services or websites, which not only affects legitimate users but also disrupts resource consumption. Moreover, the possibility of spreading invalid information to users through fake identities has increased, which results in the dissemination of harmful content. Recently, the detection of spammers and identification of fake users on Twitter has become a common area of research in contemporary online social networks (OSNs). In this paper, we perform a review of techniques used for detecting spammers on Twitter. Moreover, a taxonomy of the Twitter spam detection approaches is presented that classifies the techniques based on their ability to detect: (i) fake content, (ii) spam based on URL, (iii) spam in trending topics, and (iv) fake users. The presented techniques are also compared based on various features, such as user features, content features, graph features, structure features, and time features. We are hopeful that the presented study will be a useful resource for researchers to find the highlights of recent developments in Twitter spam detection on a single platform.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Faiza Masood, Ghana Ammad, Ahmad Almogren (Senior Member, IEEE), Assad Abbas, Hasan Ali Khattak (Senior Member, IEEE), Ikram Ud Din (Senior Member, IEEE), Mohsen Guizani (Fellow, IEEE), and Mansour Zuair, “Spammer Detection and Fake User Identification on Social Networks”, IEEE Access, 2019.

Serendipity—A Machine-Learning Application for Mining Serendipitous Drug Usage from Social Media

ABSTRACT:

Serendipitous drug usage refers to the unexpected relief of comorbid diseases or symptoms when taking a medication for a different known indication. Historically, serendipity has contributed significantly to identifying many new drug indications. If patient-reported serendipitous drug usage in social media could be computationally identified, it could help generate and validate drug-repositioning hypotheses. We investigated deep neural network models for mining serendipitous drug usage from social media. We used the word2vec algorithm to construct word embedding features from drug reviews posted in a WebMD patient forum. We adapted and redesigned the convolutional neural network, long short-term memory network, and convolutional long short-term memory network by adding contextual information extracted from drug-review posts, information-filtering tools, medical ontology, and medical knowledge. We trained, tuned, and evaluated our models with a gold-standard dataset of 15,714 sentences (447 [2.8%] describing serendipitous drug usage). Additionally, we compared our deep neural networks to support vector machine, random forest, and AdaBoost.M1 algorithms. Context information helped reduce the false-positive rate of deep neural network models. If we used an extremely imbalanced dataset with limited instances of serendipitous drug usage, deep neural network models did not outperform other machine-learning models with n-gram and context features. However, deep neural network models could more effectively use word embedding in feature construction, an advantage that makes them worthy of further investigation. Finally, we implemented natural-language processing and machine-learning methods in a web-based application to help scientists and software developers mine social media for serendipitous drug usage.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core
  • Hard Disk : 120 GB
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating System : Windows 7
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Boshu Ru et al., “Serendipity—A Machine-Learning Application for Mining Serendipitous Drug Usage from Social Media”, IEEE Transactions on NanoBioscience, 2019.