Transactional Behavior Verification in Business Process as a Service Configuration

ABSTRACT:

Business Process as a Service (BPaaS) is an emerging type of cloud service that offers configurable and executable business processes to clients over the Internet. As BPaaS is still in its early years of research, many open issues remain. Managing the configuration of BPaaS builds on areas such as software product lines and configurable business processes. The problem has concerns to consider from several perspectives, such as the different types of variable features, constraints between configuration options, and satisfying the requirements provided by the client. In our approach, we use temporal logic templates to elicit transactional requirements from clients that the configured service must adhere to. For formalizing constraints over configuration, feature models are used. To manage all these concerns during BPaaS configuration, we develop a structured process that applies formal methods while directing clients through specifying transactional requirements and selecting configurable features. Binary Decision Diagram (BDD) analysis is then used to verify that the selected configurable features do not violate any constraints. Finally, model checking is applied to verify the configured service against the transactional requirement set. We demonstrate the feasibility of our approach with several validation scenarios and performance evaluations.

EXISTING SYSTEM:

  • Existing approaches to managing business process configuration ensure domain constraints over configuration choices, while allowing basic client requirements such as selected features or control-flow variations. One area that has yet to receive research attention is ensuring both domain constraints and client transactional requirements during BPaaS configuration.
  • These requirements can include conditions for acceptable process commit or abortion, required recovery operations for key activities, or valid forms of process compensation, and are difficult to verify in a cloud-based scenario where multiple stakeholders are involved.
  • A configuration method that ensures complex requirements within a feasible runtime can provide service clients with increased trust for outsourcing potentially sensitive business operations.

DISADVANTAGES OF EXISTING SYSTEM:

  • Configuration must address concerns from several perspectives at once, such as the different types of variable features, the constraints between configuration options, and the requirements provided by the client; no existing method verifies client transactional requirements during configuration.

PROPOSED SYSTEM:

  • We propose a three-step configuration and verification process that relies on a modeling paradigm for capturing transactional requirements and subsequently verifying them. Our approach is expressive and relatively easy for stakeholders to use, while remaining sufficiently rigorous for formal verification methods to be applied.
  • We propose a BPaaS configuration process that applies formal methods to ensure that i) the configuration is valid with respect to provider domain constraints, and ii) the process satisfies transactional requirements drawn from the business rules of the client.
  • First, we provide an overview of the process which guides clients through BPaaS configuration, then we provide details on how Binary Decision Diagram (BDD) analysis and model checking are used at certain steps.
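
A minimal sketch of the constraint-checking idea, in C# to match the document's platform: domain constraints ("requires"/"excludes" relations) are evaluated directly against a selected feature set. The feature names and constraints are hypothetical, and a real implementation would compile the feature model into a BDD rather than evaluate predicates one by one.

    using System;
    using System.Collections.Generic;

    // Sketch: validate a selected feature set against domain constraints.
    // A real implementation would compile the feature model into a BDD;
    // here each constraint is checked directly as a predicate.
    class FeatureModelCheck
    {
        static void Main()
        {
            // Hypothetical selection for a shipping BPaaS.
            var selected = new HashSet<string> { "OnlinePayment", "ExpressDelivery" };

            var constraints = new List<(string Name, Func<HashSet<string>, bool> Holds)>
            {
                ("ExpressDelivery requires OnlinePayment",
                    s => !s.Contains("ExpressDelivery") || s.Contains("OnlinePayment")),
                ("CashOnDelivery excludes OnlinePayment",
                    s => !(s.Contains("CashOnDelivery") && s.Contains("OnlinePayment")))
            };

            foreach (var c in constraints)
                Console.WriteLine($"{c.Name}: {(c.Holds(selected) ? "satisfied" : "VIOLATED")}");
        }
    }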

ADVANTAGES OF PROPOSED SYSTEM:

  • To the best of our knowledge, transactional requirements important to clients, such as those supported by our template set, are not yet supported by any business process configuration method, and this is one of the major contributions of this work compared to existing works.
  • This increases client trust that the service will behave in a manner consistent with internal business policies and requirements, without having to perform their own analysis of the service behavior.
  • Our BPaaS model enables configuration from numerous perspectives important to BPaaS clients, namely, activities, resources, and data objects.
  • Our configuration method aims to elicit and ensure complex transactional requirements from clients, by adapting the temporal logic template set.
  • It has the advantage of a reduced runtime when configuring services with many configuration options and values.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Scott Bourne, Claudia Szabo, and Quan Z. Sheng, “Transactional Behavior Verification in Business Process as a Service Configuration”, IEEE Transactions on Services Computing, 2017.

Flexible Wildcard Searchable Encryption System

ABSTRACT:

Searchable encryption is an important technique for public cloud storage services: it provides user data confidentiality protection while allowing users to perform keyword search over their encrypted data. Previous schemes only deal with exact or fuzzy keyword search to correct some spelling errors. In this paper, we propose a new wildcard searchable encryption system to support wildcard keyword queries which has several highly desirable features. First, our system allows multiple-keyword search in which any queried keyword may contain zero, one or two wildcards, and a wildcard may appear in any position of a keyword and represent any number of symbols. Second, it supports simultaneous search on multiple data owners' data using only one trapdoor. Third, it provides flexible user authorization and revocation to effectively manage search and decryption privileges. Fourth, it is constructed based on homomorphic encryption rather than Bloom filters and hence completely eliminates the false probability caused by Bloom filters. Finally, it achieves a high level of privacy protection since matching results are unknown to the cloud server in the test phase. The proposed system is thoroughly analyzed and is proved secure. Extensive experimental results indicate that our system is efficient compared with other existing wildcard searchable encryption schemes in the public key setting.

EXISTING SYSTEM:

  • An existing wildcard searchable encryption scheme is based on Bloom filters, in which each keyword has its own Bloom filter. The storage overhead grows with the number of keywords extracted from the document. The disadvantage of the scheme is that one wildcard can only represent one character.
  • For instance, if a user desires to search all keywords that begin with “acid”, he has to submit the trapdoors for the wildcard keywords “acid??”, “acid???” and so on up to “acid?????????”, so that the keywords “acidic”, “acidity” and “acidification” can all be matched (see the sketch after this list).
  • To overcome this problem, Hu et al. introduced an improved scheme such that one wildcard can represent any number of characters.
  • Hu's scheme is constructed based on Suga's scheme but utilizes a different method to insert a keyword into the Bloom filter.
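
The sketch below illustrates the limitation described above: with single-character wildcards, a prefix query has to be expanded into one pattern per candidate keyword length, and a trapdoor must be generated for each pattern. Trapdoor generation itself belongs to the encryption scheme and is omitted; the prefix and lengths are taken from the example above.

    using System;
    using System.Collections.Generic;

    // Sketch: a prefix query expanded into fixed-length wildcard patterns,
    // one per keyword length; each pattern would need its own trapdoor.
    class WildcardExpansion
    {
        static IEnumerable<string> ExpandPrefixQuery(string prefix, int minLen, int maxLen)
        {
            for (int len = minLen; len <= maxLen; len++)
                yield return prefix + new string('?', len - prefix.Length);
        }

        static void Main()
        {
            // "acid" plus 2 to 9 single-character wildcards covers
            // "acidic" (6 letters) through "acidification" (13 letters).
            foreach (var pattern in ExpandPrefixQuery("acid", 6, 13))
                Console.WriteLine(pattern); // acid??, acid???, ..., acid?????????
        }
    }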

DISADVANTAGES OF EXISTING SYSTEM:

  • The limitation of fuzzy searchable encryption schemes is that only small edit-distance errors, such as spelling errors, can be corrected. They are almost useless if the query keyword has a large edit distance from the exact keyword.
  • A serious drawback of Bloom filter based searchable encryption schemes is their unavoidable false probability.

PROPOSED SYSTEM:

  • We propose a flexible wildcard searchable encryption scheme supporting multiple users. It is constructed in the public key setting without relying on Bloom filters, is efficient, and achieves a high security level. Additionally, when any suspicious action is detected, data owners can dynamically update the verification data stored on the cloud server.
  • Our system is the first wildcard SE which allows a data user to use one trapdoor to simultaneously search on multiple data owners' files. For example, a medical doctor can issue one wildcard keyword query to simultaneously search over multiple patients' encrypted EHRs.
  • Moreover, in the search algorithm, the user can use multiple keywords to generate one trapdoor. These query keywords may contain zero, one or two wildcards. The user can issue “AND” or “OR” queries on these keywords, and the top-k documents that have the highest relevance scores are returned to the user.
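
As a plaintext illustration (not the cryptographic construction itself), the sketch below shows the matching semantics the scheme supports, with '*' standing for any number of symbols at any position; the encrypted scheme evaluates the same predicate without revealing keywords. The sample keywords are hypothetical.

    using System;
    using System.Text.RegularExpressions;

    // Sketch: plaintext wildcard matching, where '*' represents any number
    // of symbols and may appear anywhere in the queried keyword.
    class WildcardSemantics
    {
        static bool Matches(string query, string keyword) =>
            Regex.IsMatch(keyword, "^" + Regex.Escape(query).Replace("\\*", ".*") + "$");

        static void Main()
        {
            Console.WriteLine(Matches("acid*", "acidity"));       // True
            Console.WriteLine(Matches("a*id*", "acidification")); // True
            Console.WriteLine(Matches("acid*", "base"));          // False
        }
    }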

ADVANTAGES OF PROPOSED SYSTEM:

  • No false probability.
  • Flexible user authorization and revocation.
  • Flexible search function.
  • Flexible wildcard representation.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Yang Yang, Ximeng Liu, Robert H. Deng, and Jian Weng, “Flexible Wildcard Searchable Encryption System”, IEEE Transactions on Services Computing, 2017.

Image Re-ranking based on Topic Diversity

ABSTRACT:

Social media sharing websites allow users to annotate images with free tags, which significantly contributes to the development of web image retrieval. Tag-based image search is an important method for finding images shared by users in social networks. However, making the top-ranked results both relevant and diverse is challenging. In this paper, we propose a topic diverse ranking approach for tag-based image retrieval with consideration of promoting topic coverage performance. First, we construct a tag graph based on the similarity between tags. Then a community detection method is applied to mine the topic community of each tag. After that, inter-community and intra-community ranking are introduced to obtain the final retrieved results. In the inter-community ranking process, an adaptive random walk model is employed to rank the communities based on the multi-information of each topic community. Besides, we build an inverted index structure for images to accelerate the searching process. Experimental results on the Flickr and NUS-WIDE datasets show the effectiveness of the proposed approach.

EXISTING SYSTEM:

  • Currently, image clustering and duplicate removal are the major approaches to settling the diversity problem. However, most of the literature treats the diversity problem as promoting visual diversity performance, while the promotion of semantic coverage is often ignored.
  • To diversify the top-ranked search results from the semantic aspect, the topic community each image belongs to should be considered.
  • Dang-Nguyen et al. first propose a clustering algorithm to obtain a topic tree, and then sort topics according to the number of images in each topic. In each cluster, the image uploaded by the user with the highest visual score is selected as the top-ranked image. The second image is the one with the largest distance to the first image. The third image is the one with the largest distance to both previous images, and so on (a small sketch of this greedy selection follows this list).
  • Most papers consider diversity from the visual perspective and achieve it by applying clustering on visual features.
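
A minimal sketch of that greedy selection, assuming images are represented by feature vectors and reading "largest distance to the previous images" as the farthest-point (max-min) rule; the 2-D vectors are hypothetical.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Sketch: greedy diversification. After the top-ranked image is fixed,
    // each next pick maximizes its minimum distance to the picked set.
    class GreedyDiversify
    {
        static double Dist(double[] a, double[] b) =>
            Math.Sqrt(a.Zip(b, (x, y) => (x - y) * (x - y)).Sum());

        static List<double[]> Select(List<double[]> images, double[] top, int k)
        {
            var picked = new List<double[]> { top };
            while (picked.Count < k)
            {
                var next = images.Except(picked)
                                 .OrderByDescending(img => picked.Min(p => Dist(img, p)))
                                 .First();
                picked.Add(next);
            }
            return picked;
        }

        static void Main()
        {
            var imgs = new List<double[]>
            {
                new[] { 0.0, 0.0 }, new[] { 1.0, 0.0 }, new[] { 5.0, 5.0 }, new[] { 0.0, 1.0 }
            };
            foreach (var img in Select(imgs, imgs[0], 3))
                Console.WriteLine($"({img[0]}, {img[1]})"); // picks (0,0), then (5,5), then a remaining point
        }
    }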

DISADVANTAGES OF EXISTING SYSTEM:

  • Tag mismatch.
  • Query ambiguity.
  • Most of the above works treat the diversity problem as promoting visual diversity rather than topic coverage.

PROPOSED SYSTEM:

  • In this paper, we focus on topic diversity. We first group all the tags in the initial retrieval image list so that tags with similar semantics fall into the same cluster, then assign images to the different clusters. Images within the same cluster are viewed as having similar semantics. After ranking the clusters and the images in each cluster, we select one image from each cluster to achieve semantic diversity.
  • In this paper, we propose to construct the tag graph and mine the topic communities to diversify the semantic information of the retrieval results. The contributions of this paper are summarized as follows:
  • We propose a topic diverse ranking approach considering the topic coverage of the retrieved images. Inter-community and intra-community ranking methods are proposed to achieve a good trade-off between diversity and relevance performance.
  • Tag graph construction based on each tag's word vector and a community mining approach are employed to detect topic communities. Each mined community represents a sub-topic under the given query. Besides, in order to better represent the relationships between tags, we train the word vector of each tag on the English Wikipedia corpus with the word2vec model.
  • We rank each mined community according to its relevance level to the query. In the inter-community ranking process, an adaptive random walk model is employed to accomplish the ranking based on the relevance of each community with respect to the query, the pair-wise similarity between communities, and the number of images in each community.
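
A minimal sketch of the inter-community ranking idea: power iteration over a community similarity matrix, biased by each community's relevance to the query. The matrix, relevance values and damping weight are hypothetical, not the paper's parameters.

    using System;
    using System.Linq;

    // Sketch: rank three communities by power iteration over a stochastic
    // similarity matrix, biased toward communities relevant to the query.
    class CommunityRank
    {
        static void Main()
        {
            double[] relevance = { 0.5, 0.3, 0.2 };   // relevance to the query
            double[,] sim =
            {
                { 0.6, 0.2, 0.2 },                    // transition weights,
                { 0.2, 0.6, 0.2 },                    // each column sums to 1
                { 0.2, 0.2, 0.6 }
            };
            double alpha = 0.85;                      // walk vs. restart weight
            double[] score = { 1 / 3.0, 1 / 3.0, 1 / 3.0 };

            for (int iter = 0; iter < 50; iter++)     // power iteration
            {
                var next = new double[3];
                for (int i = 0; i < 3; i++)
                    for (int j = 0; j < 3; j++)
                        next[i] += alpha * sim[i, j] * score[j];
                for (int i = 0; i < 3; i++)
                    next[i] += (1 - alpha) * relevance[i];
                score = next;
            }
            Console.WriteLine(string.Join(", ", score.Select(s => s.ToString("F3"))));
        }
    }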

ADVANTAGES OF PROPOSED SYSTEM:

  • Good trade-off between diversity and relevance performance.
  • With the adaptive random walk model, the community with the larger semantic relevance to the query and the larger confidence value is ranked higher.
  • Diversifies the top-ranked retrieval results.
  • Computes the similarity between the user-oriented image set and the query based on the co-occurrence tag mechanism.
  • We sort the communities based on the relevance scores obtained by the random walk.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Xueming Qian, Dan Lu, Yaxiong Wang, Li Zhu, Yuan Yan Tang, and Meng Wang, “Image Re-ranking based on Topic Diversity”, IEEE Transactions on Image Processing, 2017.

A Novel Data Hiding Algorithm for High Dynamic Range Images

ABSTRACT:

In this paper, we propose a novel data hiding algorithm for high dynamic range (HDR) images encoded by the OpenEXR file format. The proposed algorithm exploits each of three 10-bit mantissa fields as an embedding unit in order to conceal k bits of a secret message using an optimal base which produces the least pixel variation. An aggressive bit encoding and decomposition scheme is recommended, which offers a high probability to convey (k+1) bits without increasing pixel variation caused by message concealment. In addition, we present a bit inversion embedding strategy to further increase the capacities when the probability of appearance of secret bit “1” is greater than 0.5. Furthermore, we introduce an adaptive data hiding approach for concealing more secret messages in pixels with low luminance, exploiting the features of the human visual system to achieve luminance-aware adaptive data hiding. The stego HDR images produced by our algorithm coincide with the high dynamic range image file format, causing no suspicion from malicious eavesdroppers. The generated stego HDR images and their tone-mapped low dynamic range (LDR) images reveal no perceptual differences when subjected to quantitative testing by Visual Difference Predictor. Our algorithm can resist steganalytic attacks from the HDR and LDR RS and SPAM steganalyzers. We present the first data hiding algorithm for OpenEXR HDR images offering a high embedding rate and producing high visual quality of the stego images. Our algorithm outperforms the current state-of-the-art works.

EXISTING SYSTEM:

  • In an existing system, the authors proposed a distortion-free data embedding scheme for HDR images. Their scheme takes advantage of the Cartesian product of all of the HDR pixels, thus exploiting all of the homogeneous representations.
  • Their method provides an average embedding rate of 0.1355 bpp. They also introduced a new homogeneity index table for homogeneity values of N = 3, 5, 6, 7, which efficiently exploits all homogeneous representations of each pixel.
  • The improved scheme offers an average embedding rate of 0.1445 bpp.

DISADVANTAGES OF EXISTING SYSTEM:

There are three drawbacks in existing data hiding algorithms for HDR images.

  • First, while most algorithms target the 32-bit radiance RGBE or 48-bit TIFF format, none of them is developed for the OpenEXR format.
  • Second, while previously reported works constantly increase the embedding capacity, a stego HDR image generated by these algorithms does not preserve the radiance RGBE encoding format, thus becoming perceptible to eavesdroppers and vulnerable to steganalytic attack.
  • Third, most algorithms do not consider how to minimize the pixel distortion incurred by message concealment, thus producing a tone-mapped stego image of only moderate image quality.

PROPOSED SYSTEM:

  • This paper presents a novel data hiding algorithm using an optimal base, abbreviated as DHOB, which employs an optimal base to conceal a serial secret bit stream with least distortion in a high dynamic range image encoded by the 48-bit OpenEXR file format. This type of HDR image consists of three 16-bit floating-point values in the red, green and blue channels, all of them being “half” data types with a 1-bit sign, 5-bit exponent and 10-bit mantissa field.
  • Considering the variety of luminance levels in an HDR image, we propose an adaptive data hiding scheme using an optimal base, abbreviated as ADHOB, which supports luminance-aware message embedding, where more secret messages are carried in pixels with low luminance, and vice versa. This scheme exploits a feature of the human visual system, since human beings are less sensitive to luminance variation when a pixel has low luminance.
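
A minimal sketch of the embedding unit, assuming a fixed base B rather than the per-pixel optimal base search: a base-B digit replaces the mantissa's residue modulo B, and among the candidate mantissas carrying that digit, the one closest to the original is chosen to minimize pixel variation.

    using System;

    // Sketch: embed one base-B digit into a 10-bit mantissa by replacing its
    // residue modulo B, picking the candidate closest to the original value.
    class MantissaEmbed
    {
        static int Embed(int mantissa, int digit, int B)
        {
            int baseVal = mantissa - mantissa % B;
            int best = -1, bestDiff = int.MaxValue;
            foreach (int cand in new[] { baseVal - B + digit, baseVal + digit, baseVal + B + digit })
            {
                if (cand < 0 || cand > 1023) continue;   // keep within 10 bits
                int diff = Math.Abs(cand - mantissa);
                if (diff < bestDiff) { bestDiff = diff; best = cand; }
            }
            return best;
        }

        static void Main()
        {
            int stego = Embed(mantissa: 517, digit: 3, B: 8);
            // Prints: stego mantissa = 515, extracted digit = 3
            Console.WriteLine($"stego mantissa = {stego}, extracted digit = {stego % 8}");
        }
    }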

ADVANTAGES OF PROPOSED SYSTEM:

  • The proposed algorithm takes advantage of the 10-bit mantissa fields to convey secret messages, while leaving the sign and exponent fields intact.
  • The proposed algorithm works on HDR images encoded in the OpenEXR format and is capable of providing a variety of capacities while producing high-quality stego images feasible for real applications.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Yun-Te Lin, Chung-Ming Wang, Wei-Sung Chen, Fang-Pang Lin, and Woei Lin, “A Novel Data Hiding Algorithm for High Dynamic Range Images”, IEEE Transactions on Multimedia, 2017.

SPFM: Scalable and Privacy-preserving Friend Matching in Mobile Cloud

ABSTRACT:

Profile (e.g., contact list, interest, mobility) matching is important for fostering the wide use of mobile social networks. Social networks such as Facebook, LINE or WeChat recommend friends to users based on personal data such as common contact lists or mobility traces. However, outsourcing users' personal information to the cloud for friend matching raises a serious privacy concern due to the potential risk of data abuse. In this study, we propose a novel Scalable and Privacy-preserving Friend Matching protocol, or SPFM in short, which aims to provide a scalable friend matching and recommendation solution without revealing the users' personal data to the cloud. Different from previous works which involve multiple rounds of protocols, SPFM presents a scalable solution which can prevent the honest-but-curious mobile cloud from obtaining the original data and supports the friend matching of multiple users simultaneously. We give detailed feasibility and security analysis of SPFM, and its accuracy and security have been well demonstrated via extensive simulations. The results show that our scheme works even better when the original data is large.

EXISTING SYSTEM:

  • Existing mobile social network systems pay little heed to the privacy concerns associated with friend matching and recommendation based on users' personal information. For example, Facebook provides the People You May Know feature, which recommends friends based on education information, the contact lists obtained from users' smartphones, and other personal information.
  • Li et al. apply additive homomorphic encryption for privacy preservation in a scenario with many intermediate computing parties.
  • Narayanan et al. and Dong et al. compute social proximity to discover potential friends by leveraging both homomorphic cryptography and obfuscation, which is more efficient.

DISADVANTAGES OF EXISTING SYSTEM:

  • Outsourcing users' personal information to the cloud for friend matching raises a serious privacy concern.
  • Existing research shows that loss of privacy can expose users to unwanted advertisements and spams/scams, cause social reputation or economic damage, and make them victims of blackmail or even physical violence.
  • Existing works may fail to work in practice for two reasons. First, the best practice in industry for friend recommendation is a multiple-user matching problem rather than a two-party matching problem, and pre-shared parameters between users are more likely to leak. Second, most existing works involve multiple rounds of protocols, which suffer from a serious performance challenge.

PROPOSED SYSTEM:

  • In this study, we propose a novel Scalable and Privacy-preserving Friend Matching protocol, or SPFM in short, which aims to provide a scalable friend matching and recommendation solution without revealing the users' personal data to the cloud.
  • Our basic idea is that each user obfuscates every bit of the original personal data (e.g., the contact list) before uploading, by performing XOR operations with a masking sequence that is generated with a certain probability (see the sketch after this list).
  • We propose a Scalable and Privacy-preserving Friend Matching scheme (SPFM) to prevent privacy leakage in friend matching and recommendation systems.
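
A minimal sketch of that obfuscation step: each bit of a contact-list vector is XORed with a mask bit drawn with probability p. The vector and probability are hypothetical, and the protocol by which the cloud matches obfuscated vectors is omitted.

    using System;

    // Sketch: obfuscate a bit vector by XORing each bit with a random mask
    // bit that is 1 with probability p, so the cloud never sees raw bits.
    class BitObfuscation
    {
        static bool[] Obfuscate(bool[] data, double p, Random rng)
        {
            var masked = new bool[data.Length];
            for (int i = 0; i < data.Length; i++)
                masked[i] = data[i] ^ (rng.NextDouble() < p);
            return masked;
        }

        static void Main()
        {
            var rng = new Random(42);
            bool[] contacts = { true, false, true, true, false, false, true, false };
            var masked = Obfuscate(contacts, 0.2, rng);
            Console.WriteLine(string.Join("", Array.ConvertAll(masked, b => b ? "1" : "0")));
        }
    }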

ADVANTAGES OF PROPOSED SYSTEM:

  • Our design ensures that the same data maintain statistical similarity after obfuscation, while different data can still be statistically classified, all without leaking the original data.
  • We provide a detailed feasibility and security analysis as well as a discussion of correctness, the true-negative rate and the true-positive rate.
  • Extensive evaluations have been performed on SPFM to demonstrate its feasibility and security. The results show that our scheme works even better when the original data is large.

SYSTEM ARCHITECTURE: 

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Mengyuan Li, Na Ruan, QiYang Qian, Haojin Zhu, Xiaohui Liang, and Le Yu, “SPFM: Scalable and Privacy-preserving Friend Matching in Mobile Cloud”, IEEE Internet of Things Journal, 2017.

Leveraging Crowdsourcing for Efficient Malicious Users Detection in Large-Scale Social Networks

ABSTRACT:

The past few years have witnessed the dramatic popularity of large-scale social networks, where malicious node detection is one of the fundamental problems. Most existing works focus on actively detecting malicious nodes by verifying signal correlation or behavior consistency. This may not work well in large-scale social networks, since the number of users is extremely large and the difference between normal users and malicious users is inconspicuous. In this paper, we propose a novel approach that leverages the power of users to perform the detection task. We design incentive mechanisms to encourage the participation of users under two scenarios: full information and partial information. In the full-information scenario, we design a specific incentive scheme for users according to their preferences, which can provide the desirable detection result and minimize the overall cost. In the partial-information scenario, assuming that we only have statistical information about users, we first transform the incentive mechanism design into an optimization problem, and then design the optimal incentive scheme under different system parameters by solving the optimization problem. We perform extensive simulations to validate the analysis and demonstrate the impact of system factors on the overall cost.

EXISTING SYSTEM:

  • Most existing works are concerned with actively detecting malicious nodes. One approach focuses on the data received by sensors: if some data do not meet certain criteria, such as spatial correlation or frequency correlation, there may be malicious nodes. For example, sensory data in wireless sensor networks are usually location dependent, so malicious nodes can be identified if their reported data are far discrepant from those of nearby sensor nodes.
  • Stringhini et al. focused on malicious user detection in social networks using big data, studied the users' characteristics and then built a tool to detect spammers.
  • In another existing system, the authors used a classifier to identify malicious users. First, the authors collected a dataset of about 22 thousand users from the Twitter site and then manually created a label for each user. They then studied the difference between normal users and malicious users and extracted two parameters (user score and tweet score) from these data to classify users in the network.

DISADVANTAGES OF EXISTING SYSTEM:

  • Different from existing systems, our work focuses on the scenario where malicious users cannot be easily detected by the system administrator or by the data correlation of nearby users, and we propose a novel crowdsourcing-based approach to tackle the malicious user detection problem. Malicious users in social networks have a terrible impact on the network, in terms of degrading the network's performance, reducing the network's efficiency, increasing the cost or even disabling the whole network.
  • One fundamental issue in a crowdsourcing-based approach is incentive mechanism design. Since different users have different preferences for these malicious activities, many users may choose to stay silent without a proper incentive.
  • Further, malicious users may provide compensation to the victims to keep them silent. For example, a malicious user may send an advertisement to a user along with a coupon or monetary reward. In such cases, incentive provision is critical to encourage the participation of users.

PROPOSED SYSTEM:

  • In this paper, we propose an approach to detect malicious users in large-scale social networks from a radically new perspective. The system administrator does not directly participate in the detection process. Instead, it leverages the power of normal users in the social network to accomplish this difficult goal, i.e., crowdsourcing the detection tasks to the users.
  • When malicious users perform abnormal activities such as cyber attacks or advertisement injection, the users who are the victims of these activities can report them to the system administrator. Obviously, in such a way, the detection cost can be significantly reduced since no additional overhead is incurred. Also, the detection accuracy can be increased.
  • To address these issues, we investigate the incentive mechanism to encourage user participation in malicious user detection in a large-scale social network. Interestingly, we consider that a malicious user may provide incentives to normal users when it performs malicious activities (cyber attacks, advertisement injection, etc.) towards user ui. For example, if a malicious user wants to get users' profile information, providing some incentives can keep more users silent. Besides, users' preferences for malicious activities typically differ; some users are more tolerant of advertisement injection than others. We adopt contract theory to tackle our problem, i.e., we construct contractual arrangements as an incentive mechanism for the system administrator to encourage users to help detect the malicious user (a small full-information sketch follows this list).
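
A minimal sketch of the full-information case, under a simplifying assumption not stated above: if the administrator needs at least k reports and knows each user's cost (preference) for reporting, paying the k cheapest users exactly their cost minimizes the total expense. All numbers are hypothetical.

    using System;
    using System.Linq;

    // Sketch: full information. Needing at least k reports, pay the k users
    // with the lowest reporting costs exactly their costs.
    class IncentiveDesign
    {
        static void Main()
        {
            double[] reportingCost = { 4.0, 1.5, 3.0, 0.5, 2.0 }; // per-user preferences
            int k = 3;                                            // reports required

            var cheapest = reportingCost.OrderBy(c => c).Take(k).ToArray();
            Console.WriteLine($"Rewards offered: {string.Join(", ", cheapest)}");
            Console.WriteLine($"Total cost: {cheapest.Sum()}");   // 0.5 + 1.5 + 2.0 = 4
        }
    }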

ADVANTAGES OF PROPOSED SYSTEM:

  • We introduce a novel, efficient, and effective approach, i.e., crowdsourcing, to detect malicious users in large-scale social networks. Based on this, in order to encourage sufficient users to perform detection tasks, we formulate the incentive mechanism design problem.
  • We solve the incentive mechanism design problem in two scenarios: full information and partial information about users' preferences. In the full-information scenario, we design the optimal incentive mechanism by ordering users' preferences. In the partial-information scenario, assuming that we only have statistical information about users' preferences, we transform this problem into an optimization problem and solve it by exploring the form of its solution.
  • We perform extensive simulations to illustrate the relationship between the system's total cost and various factors, and to validate our analysis.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Guang Yang, Shibo He, and Zhiguo Shi, “Leveraging Crowdsourcing for Efficient Malicious Users Detection in Large-Scale Social Networks”, IEEE Internet of Things Journal, 2017.

Search Rank Fraud and Malware Detection in Google Play

ABSTRACT:

Fraudulent behaviors in Google Play, the most popular Android app market, fuel search rank abuse and malware proliferation. To identify malware, previous work has focused on app executable and permission analysis. In this paper, we introduce FairPlay, a novel system that discovers and leverages traces left behind by fraudsters, to detect both malware and apps subjected to search rank fraud. FairPlay correlates review activities and uniquely combines detected review relations with linguistic and behavioral signals gleaned from Google Play app data (87K apps, 2.9M reviews, and 2.4M reviewers, collected over half a year), in order to identify suspicious apps. FairPlay achieves over 95% accuracy in classifying gold standard datasets of malware, fraudulent and legitimate apps. We show that 75% of the identified malware apps engage in search rank fraud. FairPlay discovers hundreds of fraudulent apps that currently evade Google Bouncer's detection technology. FairPlay also helped the discovery of more than 1,000 reviews, reported for 193 apps, that reveal a new type of “coercive” review campaign: users are harassed into writing positive reviews, and install and review other apps.

EXISTING SYSTEM:

  • Google Play uses the Bouncer system to remove malware. However, out of the 7,756 Google Play apps we analyzed using VirusTotal, 12% (948) were flagged by at least one anti-virus tool and 2% (150) were identified as malware by at least 10 tools.
  • Sarma et al. use risk signals extracted from app permissions, e.g., rare critical permissions (RCP) and rare pairs of critical permissions (RPCP), to train an SVM and inform users of the risks vs. benefits tradeoffs of apps.
  • Peng et al. propose a score to measure the risk of apps, based on probabilistic generative models such as Naive Bayes.
  • Yerima et al. also use features extracted from app permissions, API calls and commands extracted from the app executables.

DISADVANTAGES OF EXISTING SYSTEM:

  • Previous work has focused on app executable and permission analysis only.
  • Not efficient.
  • Lower detection rates.
  • Takes more time.

PROPOSED SYSTEM:

  • We propose FairPlay, a system that leverages the traces left behind by fraudsters to efficiently detect Google Play fraud and malware. Our major contributions are:
  • To detect fraud and malware, we propose and generate relational, behavioral and linguistic features, which we use to train supervised learning algorithms.
  • We formulate the notion of co-review graphs to model reviewing relations between users.
  • We develop PCF, an efficient algorithm to identify temporally constrained co-review pseudo-cliques, formed by reviewers with substantially overlapping co-reviewing activities across short time windows.
  • We use the temporal dimension of review post times to identify suspicious review spikes received by apps; we show that to compensate for a negative review, a fraudster needs to post at least (R - 1)/(5 - R) positive reviews for an app that has rating R (a worked check follows this list). We also identify apps with “unbalanced” review, rating and install counts, as well as apps with permission request ramps.
  • We use linguistic and behavioral information to (i) detect genuine reviews from which we then (ii) extract user-identified fraud and malware indicators.
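
The bound above can be checked by requiring the injected reviews alone to average at least R: one 1-star review plus k 5-star reviews satisfy (1 + 5k)/(1 + k) >= R exactly when k >= (R - 1)/(5 - R). A small worked check, with illustrative ratings:

    using System;

    // Worked check: one 1-star review plus k 5-star reviews keep the injected
    // average at rating R only when (1 + 5k) / (1 + k) >= R.
    class CompensationBound
    {
        static int PositiveReviewsNeeded(double R) =>
            (int)Math.Ceiling((R - 1) / (5 - R));

        static void Main()
        {
            foreach (var R in new[] { 3.0, 4.0, 4.5 })
                Console.WriteLine($"rating {R}: at least {PositiveReviewsNeeded(R)} five-star reviews");
            // rating 3: 1, rating 4: 3, rating 4.5: 7
        }
    }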

ADVANTAGES OF PROPOSED SYSTEM:

  • We build this work on the observation that fraudulent and malicious behaviors leave behind telltale signs on app markets.
  • FairPlay achieves over 97% accuracy in classifying fraudulent and benign apps, and over 95% accuracy in classifying malware and benign apps.
  • FairPlay significantly outperforms the malware indicators of Sarma et al. Furthermore, we show that malware often engages in search rank fraud as well: when trained on fraudulent and benign apps, FairPlay flagged as fraudulent more than 75% of the gold standard malware apps.
  • FairPlay discovers hundreds of fraudulent apps.
  • FairPlay also enabled us to discover a novel, coercive review campaign attack type, where app users are harassed into writing a positive review for the app, and installing and reviewing other apps.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Mahmudur Rahman, Mizanur Rahman, Bogdan Carbunar, and Duen Horng Chau, “Search Rank Fraud and Malware Detection in Google Play”, IEEE Transactions on Knowledge and Data Engineering, 2017.

Privacy-Preserving Selective Aggregation of Online User Behavior Data

ABSTRACT:

Tons of online user behavior data are being generated every day on the booming and ubiquitous Internet. Growing efforts have been devoted to mining the abundant behavior data to extract valuable information for research purposes or business interests. However, online users' privacy is thus at risk of being exposed to third parties. The last decade has witnessed a body of research works trying to perform data aggregation in a privacy-preserving way. Most existing methods guarantee strong privacy protection yet at the cost of very limited aggregation operations, such as allowing only summation, which hardly satisfies the needs of behavior analysis. In this paper, we propose a scheme, PPSA, which encrypts users' sensitive data to prevent privacy disclosure from both outside analysts and the aggregation service provider, and fully supports selective aggregate functions for online user behavior analysis while guaranteeing differential privacy. We have implemented our method and evaluated its performance using a trace-driven evaluation based on a real online behavior dataset. Experiment results show that our scheme effectively supports both overall aggregate queries and various selective aggregate queries with acceptable computation and communication overheads.

EXISTING SYSTEM:

  • Jung et al. proposed a system that can perform multivariate polynomial evaluation. Unfortunately, it still does not support selection. However, selective aggregation is one of the most important operations for queries on databases: it can be used to tell the difference among user groups in a certain aspect, for instance, “select avg(income) from database group by gender”.
  • Chen et al. used an order-preserving hash-based function to encode both data and queries instead. But they do not have the same goal as us and cannot evaluate selective aggregation.
  • Li et al. proposed a system that processes range queries, which yet does not compute aggregation and assumes analysts to be trusted.
  • Akkus et al. presented the first system that provides web analytics without tracking, with a differential privacy guarantee, yet it requires a public-key operation per single bit of user data, causing high overheads.

DISADVANTAGES OF EXISTING SYSTEM:

  • Raises serious concerns about disclosure of users' privacy.
  • To address the conflict between the utility of analysis results and users' privacy, much effort has been devoted to designing protocols that allow operations on user data while still protecting users' privacy.
  • Unfortunately, existing schemes guarantee strong privacy at the expense of limitations on analysis.
  • Most of them can only compute the summation and mean of data over all users without filter or selection, i.e., overall aggregation.
  • Some previous methods allow more complex computation.

PROPOSED SYSTEM:

  • The main goal of this paper is to design a practical protocol that is able to compute selective aggregation of user data while still preserving users' privacy. There are mainly three challenges.
  • First, the untrusted intermediary needs to evaluate selective aggregation obliviously. It cannot access user data for privacy reasons, but we need it to perform the computations that achieve selection and aggregation on user data. We exploit a homomorphic cryptosystem to address this challenge, but by itself it does not directly support data selection.
  • Second, our scheme PPSA needs to achieve differential privacy in a homomorphic cryptosystem. To protect individuals' privacy, we need to obliviously add noise to aggregate results in addition to encrypting user data. Existing differential privacy mechanisms generate noise from real numbers, but homomorphic cryptosystems require plaintexts to be integers. Simply scaling real numbers to integers would cause inaccuracy and inconvenience, so we need to resolve this conflict.
  • Third, PPSA should be resistant to client churn, the situation where clients switch between online and offline frequently. When an analyst issues a query, there could be few users connected, which means few data can be collected to evaluate the query. But the analyst wants the intermediary to respond to her as soon as possible. Thus, our protocol needs to tolerate client churn and evaluate the query both timely and accurately.

ADVANTAGES OF PROPOSED SYSTEM:

  • We present the first scheme, PPSA, that allows privacy-preserving selective aggregation on user data, which plays a critical role in online user behavior analysis.
  • We combine homomorphic encryption and a differential privacy mechanism to protect users' sensitive information from both analysts and aggregation service providers, and to prevent individuals' privacy from being inferred. We prove that differential privacy can be achieved by adding two geometric variables, computed via homomorphic encryption (a small sampling sketch follows this list).
  • Furthermore, we present a privacy analysis of PPSA.
  • We extend PPSA to two more scenarios to fully support more complex selective aggregation of user data. We utilize a calculation to evaluate aggregation selected by multiple boolean attributes. We design a way of obliviously comparing two integers, and utilize it to evaluate aggregation selected by a numeric attribute.
  • We implement PPSA and perform a trace-driven evaluation based on an online behavior dataset. Evaluation results show that our scheme effectively supports various selective aggregate queries with high accuracy and acceptable computation and communication overheads.
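
A minimal sampling sketch for the noise mentioned above, under the common reading that the two geometric variables enter with opposite signs so the result is symmetric integer-valued noise (a discrete Laplace); the success probability and aggregate below are illustrative, and the homomorphic evaluation is omitted.

    using System;

    // Sketch: symmetric integer noise from two geometric samples; the
    // difference follows a two-sided geometric (discrete Laplace) law.
    class GeometricNoise
    {
        static int Geometric(Random rng, double p) =>
            (int)Math.Floor(Math.Log(1.0 - rng.NextDouble()) / Math.Log(1 - p));

        static void Main()
        {
            var rng = new Random();
            double p = 0.3;                  // controls the noise magnitude
            int trueSum = 1000;              // hypothetical aggregate
            int noise = Geometric(rng, p) - Geometric(rng, p);
            Console.WriteLine($"released aggregate: {trueSum + noise} (noise {noise})");
        }
    }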

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB.

SOFTWARE REQUIREMENTS:

  • Operating system : Windows 7.
  • Coding Language : JAVA/J2EE
  • Tool : Netbeans 7.2.1
  • Database : MYSQL

REFERENCE:

Jianwei Qian, Fudong Qiu, Fan Wu, Na Ruan, Guihai Chen, and Shaojie Tang, “Privacy-Preserving Selective Aggregation of Online User Behavior Data”, IEEE Transactions on Computers, 2017.

NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media

ABSTRACT:

Nowadays, a large portion of people rely on available content in social media in their decisions (e.g., reviews and feedback on a topic or product). The possibility that anybody can leave a review provides a golden opportunity for spammers to write spam reviews about products and services for different interests. Identifying these spammers and the spam content is a hot topic of research, and although a considerable number of studies have been done recently toward this end, the methodologies put forth so far barely detect spam reviews, and none of them show the importance of each extracted feature type. In this study, we propose a novel framework, named NetSpam, which utilizes spam features for modeling review datasets as heterogeneous information networks to map the spam detection procedure into a classification problem in such networks. Using the importance of spam features helps us obtain better results in terms of different metrics on real-world review datasets from the Yelp and Amazon websites. The results show that NetSpam outperforms existing methods and that, among four categories of features, including review-behavioral, user-behavioral, review-linguistic and user-linguistic, the first type of features performs better than the other categories.

EXISTING SYSTEM:

  • Existing techniques can be classified into different categories: some use linguistic patterns in text, mostly based on bigrams and unigrams; others are based on behavioral patterns that rely on features extracted from patterns in users' behavior, which are mostly metadata based; and some techniques use graphs, graph-based algorithms and classifiers.
  • Existing systems can be summarized into three categories: linguistic-based methods, behavior-based methods and graph-based methods.
  • Feng et al. use unigrams, bigrams and their composition. Other studies use further features such as pairwise features (features between two reviews, e.g., content similarity) and the percentage of CAPITAL words in a review for finding spam reviews.
  • Lai et al. used probabilistic language modeling to spot spam. This study demonstrates that 2% of reviews written on business websites are actually spam.
  • Deeper analysis of the literature shows that behavioral features work better than linguistic ones in terms of the accuracy they yield.

DISADVANTAGES OF EXISTING SYSTEM:

  • The fact that anyone with any identity can leave comments as reviews provides a tempting opportunity for spammers to write fake reviews designed to mislead users' opinions. These misleading reviews are then multiplied by the sharing function of social media and propagation over the web.
  • Many aspects have been missed or remain unsolved.
  • Previous works also aimed to address the importance of features, but mainly in terms of obtained accuracy and not as a built-in function in their framework (i.e., their approach is dependent on ground truth for determining each feature's importance).

PROPOSED SYSTEM:

  • The general concept of our proposed framework is to model a given review dataset as a Heterogeneous Information Network (HIN) and to map the problem of spam detection into a HIN classification problem.
  • In particular, we model the review dataset as a HIN in which reviews are connected through different node types (such as features and users). A weighting algorithm is then employed to calculate each feature's importance (or weight). These weights are utilized to calculate the final labels for reviews using both unsupervised and supervised approaches.
  • We propose the NetSpam framework, a novel network-based approach which models review networks as heterogeneous information networks. The classification step uses different metapath types which are innovative in the spam detection domain.
  • A new weighting method for spam features is proposed to determine the relative importance of each feature and show how effective each feature is in identifying spam among normal reviews.
  • NetSpam improves on the state-of-the-art in terms of time complexity, which highly depends on the number of features used to identify a spam review; hence, using the features with larger weights results in detecting fake reviews more easily and with less time complexity (a toy weighting example follows this list).
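
A toy illustration of the weighting idea: each spam feature assigns a review a spam probability, and a weighted average produces the final label. The feature names, weights and probabilities are hypothetical; NetSpam derives the weights from metapaths in the review network rather than fixing them by hand.

    using System;
    using System.Linq;

    // Toy example: combine per-feature spam probabilities with weights into
    // one spam score for a review, then threshold it for the final label.
    class WeightedSpamScore
    {
        static void Main()
        {
            var features = new (string Name, double Weight, double Prob)[]
            {
                ("burstiness (review-behavioral)",       0.40, 0.9),
                ("rating deviation (user-behavioral)",   0.30, 0.7),
                ("caps ratio (review-linguistic)",       0.20, 0.4),
                ("content similarity (user-linguistic)", 0.10, 0.2),
            };

            double score = features.Sum(f => f.Weight * f.Prob) / features.Sum(f => f.Weight);
            Console.WriteLine($"spam score = {score:F2} -> {(score >= 0.5 ? "spam" : "genuine")}");
        }
    }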

ADVANTAGES OF PROPOSED SYSTEM:

  • Improved accuracy.
  • Easier detection of fake reviews.
  • Less time complexity.
  • As we explain in our unsupervised approach, NetSpam is able to find the importance of features even without ground truth, relying only on the metapath definition and the values calculated for each review.
  • No previous method engages the importance of features (known as weights in our proposed framework, NetSpam) in the classification step. By using these weights, on one hand we involve feature importance in calculating the final labels, and hence the accuracy of NetSpam increases gradually.
  • On the other hand, we can determine which features provide better performance in terms of their involvement in connecting spam reviews (in the proposed network).

SYSTEM ARCHITECTURE:

(Architecture diagram: Admin and User roles, with “Upload Product” and “View Product and reviews” operations.)

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Saeedreza Shehnepoor, Mostafa Salehi, Reza Farahbakhsh, and Noel Crespi, “NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media”, IEEE Transactions on Information Forensics and Security, 2017.

FastGeo: Efficient Geometric Range Queries on Encrypted Spatial Data

ABSTRACT:

Spatial data have wide applications, e.g., location-based services, and geometric range queries (i.e., finding points inside geometric areas, e.g., circles or polygons) are one of the fundamental search functions over spatial data. The rising demand for outsourcing data is moving large-scale datasets, including large-scale spatial datasets, to public clouds. Meanwhile, due to the concern of insider attackers and hackers on public clouds, the privacy of spatial datasets should be cautiously preserved while querying them at the server side, especially for location-based and medical usage. In this paper, we formalize the concept of Geometrically Searchable Encryption, and propose an efficient scheme, named FastGeo, to protect the privacy of clients' spatial datasets stored and queried at a public server. With FastGeo, which is a novel two-level search for encrypted spatial data, an honest-but-curious server can efficiently perform geometric range queries, and correctly return data points that are inside a geometric range to a client without learning sensitive data points or this private query. FastGeo supports arbitrary geometric areas, achieves sublinear search time, and enables dynamic updates over encrypted spatial datasets. Our scheme is provably secure, and our experimental results on real-world spatial datasets in a cloud platform demonstrate that FastGeo can boost search time by over 100 times.

EXISTING SYSTEM:

  • Wang et al. proposed a scheme which retrieves points inside a circle over encrypted data by using a set of concentric circles.
  • Zhu et al. also built a scheme for circular range search over encrypted spatial data. Unfortunately, these two schemes exclusively work for circles, and do not apply to other geometric areas.
  • Another existing system leverages Bloom filters and their properties, where a data point is represented as a Bloom filter, a geometric range query is also formed as a Bloom filter, and the inner product of these two Bloom filters correctly indicates whether a point is inside a geometric area. Its advanced version with R-trees can achieve logarithmic search on average.

DISADVANTAGES OF EXISTING SYSTEM:

  • Due to the potential threats of inside attackers and hackers, the privacy of spatial datasets in public clouds should be carefully taken care of, particularly in location-based and medical applications.
  • For instance, a compromise of AWS by an inside attacker or hacker would put millions of Yelp users’ sensitive locations under the spotlight.
  • Some schemes do not reveal search decisions (such as inside or outside) over encrypted data, which limits their usage in search.

PROPOSED SYSTEM:

  • In this paper, we formalize the concept of Geometrically Searchable Encryption (GSE), which evolves from the definitions of SE schemes but focuses on answering geometric queries.
  • We propose a GSE scheme, named FastGeo, which can efficiently retrieve points inside a geometric area without revealing private data points or sensitive geometric range queries to an honest-but-curious server.
  • Instead of directly evaluating compute-then-compare operations, our main idea is to convert spatial data and geometric range queries to a new form, denoted as equality-vector form, and leverage a two-level search as our key solution to verify whether a point is inside a geometric range, where the first level securely operates equality checking with a PRF and the second level privately evaluates inner products with Shen-Shi-Waters (SSW) encryption (a plaintext sketch of this two-level search follows this list).
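
A plaintext analogue of the two-level search, with hypothetical points and query: level one is an equality lookup on a hash table (done with a PRF over protected values in FastGeo), and level two decides membership by an inner product of indicator vectors (done under SSW encryption in FastGeo).

    using System;
    using System.Collections.Generic;

    // Sketch: level one looks up points by x in a hash table; level two tests
    // membership of y with an inner product of 0/1 indicator vectors.
    class TwoLevelSearch
    {
        const int Domain = 10; // toy y-domain: 0..9

        static int[] Indicator(params int[] ys)
        {
            var v = new int[Domain];
            foreach (int y in ys) v[y] = 1;
            return v;
        }

        static int Dot(int[] a, int[] b)
        {
            int s = 0;
            for (int i = 0; i < Domain; i++) s += a[i] * b[i];
            return s;
        }

        static void Main()
        {
            // Points (2,3), (2,7), (5,1) indexed by x-coordinate.
            var index = new Dictionary<int, List<int[]>>
            {
                [2] = new List<int[]> { Indicator(3), Indicator(7) },
                [5] = new List<int[]> { Indicator(1) }
            };

            // Range covers (x=2, y in {3,4}) and (x=5, y=9).
            var query = new Dictionary<int, int[]> { [2] = Indicator(3, 4), [5] = Indicator(9) };

            foreach (var kv in query)
                if (index.TryGetValue(kv.Key, out var bucket))   // level 1: equality
                    foreach (var point in bucket)
                        if (Dot(point, kv.Value) > 0)            // level 2: inner product
                            Console.WriteLine($"match at x = {kv.Key}"); // only (2,3) matches
        }
    }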

ADVANTAGES OF PROPOSED SYSTEM:

  • With the embedding of a hash table and a set of linked lists in our two-level search as a novel structure for spatial data, FastGeo can achieve sublinear search and support arbitrary geometric ranges (e.g., circles and polygons).
  • FastGeo not only provides highly efficient updates over encrypted spatial data, but also improves search performance by over 100x.
  • We formalize the definition of GSE and its leakage function, and rigorously prove data privacy and query privacy with indistinguishability under selective chosen-plaintext attacks.
  • FastGeo is highly efficient over a real-world spatial dataset.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS:

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Boyang Wang, Ming Li, and Li Xiong, “FastGeo: Efficient Geometric Range Queries on Encrypted Spatial Data”, IEEE Transactions on Dependable and Secure Computing, 2017.