The Server Provisioning Problem for Continuous Distributed Interactive Applications

ABSTRACT:

In this paper, we study the server provisioning problem for continuous Distributed Interactive Applications (DIAs) whose application states not only change because of the operations performed by participants, but also evolve along with the passing of time. We focus on finding the locations of servers for hosting continuous DIAs, with the goals of optimizing the interactivity performance while fulfilling the consistency and fairness requirements. We show that the server provisioning problem is challenging by presenting its NP-hardness and non-approximability results under several conditions. We propose two efficient server placement algorithms and analyze their approximation ratios. The approximation ratio of the proposed M-BETTER algorithm is quite close to a lower bound for any polynomial-time algorithm. We also conduct experimental evaluations to compare the proposed algorithms with several baseline server placements.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • Zhang and Tang have proved that the minimum achievable interaction time between clients for fulfilling the consistency and fairness requirements is given by the length of the longest interaction path among all clients.
  • The classical k-center and k-median problems have been strongly advocated for server placement in the Internet. These two problems aim to place k servers in the network to minimize the maximum network latency and the total network latency from the clients to their nearest servers, respectively. They suit web content delivery well, since its performance is primarily determined by how fast the content stored on the servers is delivered to the clients.
  • Zhang and Tang studied how to optimize the assignment of clients to servers for DIAs given a set of servers placed.

DISADVANTAGES OF EXISTING SYSTEM:

  • A major barrier to the quality of experience in DIAs is the communication latency across the network.
  • Even with distributed server infrastructures, the network latency cannot be completely eliminated from the interactions between clients in DIAs. The network latency involved in client interactions is directly affected by the locations where servers are placed. Thus, server placement is of crucial importance to the interactivity performance of DIAs.
  • (i) The network latencies do not satisfy the triangle inequality.
  • (ii) The locations where servers can be placed are restricted.
  • (iii) The number of server locations to select is limited.

PROPOSED SYSTEM:

  • This paper explores server provisioning for continuous DIAs with consideration of synchronization delays, which lead to a different optimization objective from that for discrete DIAs.
  • In this paper, we formally define the problem of finding the locations of servers for hosting continuous DIAs, with the goal of optimizing the interactivity performance while maintaining the consistency and fairness of DIAs.
  • We further prove that the server provisioning problem cannot be approximated within any bounded factor under condition (i), within a factor of 3/2 under condition (ii), and within a factor of 4/3 under condition (iii). We propose two efficient server placement algorithms and show that they significantly outperform the baseline server placements by means of both theoretical analysis and experimental evaluation. In particular, the proposed M-BETTER algorithm has an approximation ratio quite close to the lower bound of 3/2.

ADVANTAGES OF PROPOSED SYSTEM:

  • From the computability perspective, the difference in the optimization objective gives rise to a much richer set of non-approximability results in this paper than the existing one.
  • We have shown that this is a challenging problem by analyzing its hardness under various conditions.
  • We have proved that the problem cannot be approximated within any constant factor for networks without the triangle inequality; within a factor of 3/2 if the choices of server locations are restricted; and within a factor of 4/3 if there is a limit on the number of server locations to select. We have proposed two server placement algorithms, M-GREEDY and M-BETTER, with approximation ratios of 2 and 5/3 respectively (a sketch of the classical greedy placement such algorithms build on follows this list).
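
The following is a minimal, illustrative sketch of the classical farthest-point greedy placement for the k-center problem (a well-known 2-approximation when latencies obey the triangle inequality). It is not the paper's M-GREEDY or M-BETTER algorithm; the class name, method name and latency matrix are assumptions made only for this example, written in the C#.NET setting used by the projects.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Classical farthest-point greedy placement for k-center (Gonzalez-style
    // 2-approximation under the triangle inequality). Illustrative only; it is
    // NOT the paper's M-GREEDY/M-BETTER, which additionally handle the
    // consistency and fairness constraints of continuous DIAs.
    static class KCenterSketch
    {
        // latency[i][j]: measured network latency between candidate sites i and j.
        public static List<int> Place(double[][] latency, int k)
        {
            int n = latency.Length;
            var centers = new List<int> { 0 };            // seed with an arbitrary site
            var distToNearest = latency[0].ToArray();      // distance of each site to its nearest chosen center

            while (centers.Count < k)
            {
                // Pick the site that is currently worst served (farthest from all chosen centers).
                int next = 0;
                for (int i = 1; i < n; i++)
                    if (distToNearest[i] > distToNearest[next]) next = i;
                centers.Add(next);

                // Refresh each site's distance to its nearest chosen center.
                for (int i = 0; i < n; i++)
                    distToNearest[i] = Math.Min(distToNearest[i], latency[next][i]);
            }
            return centers;
        }
    }

With a symmetric latency matrix, Place(latency, k) returns k candidate sites whose worst-case client-to-nearest-server latency is within a factor of 2 of optimal, provided the triangle inequality holds; the paper refines this style of guarantee for interaction paths between clients.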

SYSTEM ARCHITECTURE:

The Server Provisioning Problem for Continuous Distributed Interactive Applications

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows XP/7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2010
  • Database : SQL SERVER 2008

REFERENCE:

Hanying Zheng and Xueyan Tang, Senior Member, IEEE, “The Server Provisioning Problem for Continuous Distributed Interactive Applications”, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 27, NO. 1, JANUARY 2016.

Publicly Verifiable Inner Product Evaluation over Outsourced Data Streams under Multiple Keys

ABSTRACT:

Uploading data streams to a resource-rich cloud server for inner product evaluation, an essential building block in many popular stream applications (e.g., statistical monitoring), is appealing to many companies and individuals. On the other hand, verifying the result of the remote computation plays a crucial role in addressing the issue of trust. Since the outsourced data collection likely comes from multiple data sources, it is desired for the system to be able to pinpoint the originator of errors by allotting each data source a unique secret key, which requires the inner product verification to be performed under any two parties’ different keys. However, the present solutions either depend on a single-key assumption or on powerful yet practically inefficient fully homomorphic cryptosystems. In this paper, we focus on the more challenging multi-key scenario where data streams are uploaded by multiple data sources with distinct keys. We first present a novel homomorphic verifiable tag technique to publicly verify the outsourced inner product computation on the dynamic data streams, and then extend it to support the verification of matrix product computation. We prove the security of our scheme in the random oracle model. Moreover, the experimental result also shows the practicability of our design.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • Most of the existing verifiable computation schemes only focus on the single-key setting, i.e., data and its computation are outsourced from merely one contributor or from multiple contributors but with the same key.
  • On the other hand, we may resort to powerful fully homomorphic encryption (FHE), but are hardly willing to use it in practice due to efficiency concerns. As a result, we are still striving to come up with a promising solution in such a challenging multi-key setting.
  • Recently, several works towards public verification either for specific classes of computations or for arbitrary computations have been proposed. However, the outsourced data has to be a priori fixed. Another interesting line of works considered a different setting for verifiable computation.

DISADVANTAGES OF EXISTING SYSTEM:

  • The typical signature algorithm does not serve on purpose of verifiable multi-key computation.
  • These schemes may not be applied to the stream setting since sources lost data control after the outsourcing and thus cannot generate the corresponding secrets for the verification.

PROPOSED SYSTEM:

  • In this paper, we introduce a novel homomorphic verifiable tag technique and design an efficient and publicly verifiable inner product computation scheme on the dynamic outsourced data stream under multiple keys. Our contributions are summarized as follows:
  • To the best of our knowledge, this is the first work that addresses the problem of verifiable delegation of inner product computation over (potentially unbounded) outsourced data streams under the multi-key setting. Specifically, we first present a publicly verifiable group-by-sum algorithm, which serves as a building block for verifying the inner product of dynamic vectors under two different keys. Then, we extend the construction of the verifiable inner product computation to support matrix product from any two different sources.
  • Our scheme is efficient enough for practical use in terms of communication and computation overhead. Specifically, the size of the proof generated by the server to authenticate the computation result is constant, regardless of the input size n of the evaluated function. In addition, the verification overhead on the client side is constant for inner product queries. For a matrix product query, the verification cost is O(n²), in stark contrast to the super-quadratic computational complexity of matrix multiplication.
  • Our scheme achieves the public verifiability, i.e., a keyless client is able to verify the computation results.
  • We formally define and prove the security of our scheme under the Computational Diffie-Hellman assumption in the random oracle model.

ADVANTAGES OF PROPOSED SYSTEM:

  • Compared with the existing works under the single-key setting, our scheme aims at the more challenging multi-key scenario, i.e., it allows multiple data sources with different secret keys to upload their endless data streams and delegate the corresponding computations to a third party server.
  • Any keyless client is able to publicly verify the validity of the returned computation result.
  • Security analysis shows that our scheme is provably secure under the CDH assumption in the random oracle model.

SYSTEM ARCHITECTURE:

Publicly Verifiable Inner Product Evaluation over Outsourced Data Streams under Multiple Keys

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Xuefeng Liu, Wenhai Sun, Student Member, IEEE, Hanyu Quan, Wenjing Lou, Fellow, IEEE, Yuqing Zhang, Hui Li, Member, IEEE, “Publicly Verifiable Inner Product Evaluation over Outsourced Data Streams under Multiple Keys”, IEEE Transactions on Services Computing 2016.

EPLQ: Efficient Privacy-Preserving Location-Based Query Over Outsourced Encrypted Data

ABSTRACT:

With the pervasiveness of smart phones, location based services (LBS) have received considerable attention and become more popular and vital recently. However, the use of LBS also poses a potential threat to user’s location privacy. In this paper, aiming at spatial range query, a popular LBS providing information about points of interest (POIs) within a given distance, we present an efficient and privacy-preserving location-based query solution, called EPLQ. Specifically, to achieve privacy preserving spatial range query, we propose the first predicate-only encryption scheme for inner product range (IPRE), which can be used to detect whether a position is within a given circular area in a privacy-preserving way. To reduce query latency, we further design a privacy-preserving tree index structure in EPLQ. Detailed security analysis confirms the security properties of EPLQ. In addition, extensive experiments are conducted, and the results demonstrate that EPLQ is very efficient in privacy preserving spatial range query over outsourced encrypted data. In particular, a mobile LBS user using an Android phone needs around 0.9 s to generate a query, and a commodity workstation, which plays the role of the cloud in our experiments, needs only a few seconds to search POIs.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • Recently, there are already some solutions for privacy preserving spatial range query.
  • Protecting the privacy of user location in LBS has attracted considerable interest. However, significant challenges still remain in the design of privacy-preserving LBS, and new challenges arise particularly due to data outsourcing. In recent years, there is a growing trend of outsourcing data including LBS data because of its financial and operational benefits.
  • Lying at the intersection of mobile computing and cloud computing, designing privacy-preserving outsourced spatial range query faces several challenges, as detailed below.

DISADVANTAGES OF EXISTING SYSTEM:

  • Challenge on querying encrypted LBS data. The LBS provider is not willing to disclose its valuable LBS data to the cloud. The LBS provider encrypts and outsources private LBS data to the cloud, and LBS users query the encrypted data in the cloud. As a result, querying encrypted LBS data without privacy breach is a big challenge, and we need to protect not only the user locations from the LBS provider and cloud but also LBS data from the cloud.
  • Challenge on the resource consumption in mobile devices. Many LBS users are mobile users, and their terminals are smart phones with very limited resources. However, the cryptographic or privacy-enhancing techniques used to realize privacy-preserving query usually result in high computational cost and/or storage cost at user side.
  • Challenge on the efficiency of POI searching. Spatial range query is an online service, and LBS users are sensitive to query latency. To provide good user experiences, the POI search performing at the cloud side must be done in a short time (e.g., a few seconds at most). Again, the techniques used to realize privacy-preserving query usually increase the search latency.
  • Challenge on security. LBS data are about POIs in the real world. It is reasonable to assume that the attacker may have some knowledge about the original LBS data. With such knowledge, known-sample attacks are possible.

PROPOSED SYSTEM:

  • In this paper, we propose an efficient solution for privacy-preserving spatial range query named EPLQ, which allows queries over encrypted LBS data without disclosing user locations to the cloud or LBS provider.
  • To protect the privacy of user location in EPLQ, we design a novel predicate-only encryption scheme for inner product range (IPRE scheme for short), which, to the best of our knowledge, is the first predicate/predicate-only scheme of this kind. To improve the performance, we also design a privacy-preserving index structure named ŝs-tree. Specifically, the main contributions of this paper are threefold.
  • We propose IPRE, which allows testing whether the inner product of two vectors is within a given range without disclosing the vectors. In predicate encryption, the key corresponding to a predicate f can decrypt a ciphertext if and only if the attribute of the ciphertext x satisfies the predicate, i.e., f(x) = 1. Predicate-only encryption is a special type of predicate encryption not designed for encrypting/decrypting messages. Instead, it reveals that whether f(x) = 1 or not. Predicate-only encryption schemes supporting different types of predicates have been proposed for privacy-preserving query on outsourced data.
  • We propose EPLQ, an efficient solution for privacy preserving spatial range query. In particular, we show that whether a POI matches a spatial range query or not can be tested by examining whether the inner product of two vectors is in a given range (this reduction is sketched after this list). The two vectors contain the location information of the POI and the query, respectively. Based on this discovery and our IPRE scheme, spatial range query without leaking location information can be achieved. To avoid scanning all POIs to find matched POIs, we further exploit a novel index structure named ŝs-tree, which conceals sensitive location information with our IPRE scheme.
  • Our techniques can be used for more kinds of privacy-preserving queries over outsourced data. In the spatial range query discussed in this work, we consider Euclidean distance, which is widely used in spatial databases. Our IPRE scheme and ŝs-tree may be used for searching records within a given weighted Euclidean distance or great-circle distance as well. Weighted Euclidean distance is used to measure the dissimilarity in many kinds of data, while great-circle distance is the distance between two points on the surface of a sphere.
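
To make the reduction mentioned above concrete, here is a minimal plaintext illustration (no encryption involved) of how a circular range test becomes an inner product range test. The vector encodings below follow a standard algebraic identity; the class and method names are hypothetical, and the real EPLQ evaluates this test over IPRE ciphertexts and the ŝs-tree index.

    using System;

    // A POI at (x, y) matches a circle query (cx, cy, radius r) iff the inner
    // product of the two encoded vectors falls in [0, r^2], because that inner
    // product equals the squared Euclidean distance between POI and query centre.
    static class InnerProductRangeSketch
    {
        // POI encoded as u = (x^2 + y^2, x, y, 1).
        public static double[] EncodePoi(double x, double y)
            => new[] { x * x + y * y, x, y, 1.0 };

        // Query circle centre encoded as v = (1, -2*cx, -2*cy, cx^2 + cy^2).
        public static double[] EncodeQuery(double cx, double cy)
            => new[] { 1.0, -2 * cx, -2 * cy, cx * cx + cy * cy };

        public static double InnerProduct(double[] u, double[] v)
        {
            double s = 0;
            for (int i = 0; i < u.Length; i++) s += u[i] * v[i];
            return s;                                     // = (x - cx)^2 + (y - cy)^2
        }

        public static bool Matches(double x, double y, double cx, double cy, double r)
            => InnerProduct(EncodePoi(x, y), EncodeQuery(cx, cy)) <= r * r;
    }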

ADVANTAGES OF PROPOSED SYSTEM:

  • To the best of our knowledge, there does not exist a predicate/predicate-only scheme supporting inner product range. Though our scheme is used for privacy-preserving spatial range query in this paper, it may be applied in other applications as well.
  • Experiments on our implementation demonstrate that our solution is very efficient.
  • Moreover, security analysis shows that EPLQ is secure under known-sample attacks and ciphertext-only attacks.
  • Using great-circle distance instead of Euclidean distance for long distances on the surface of the Earth is more accurate. By supporting these two kinds of distances, privacy-preserving similarity query and long spatial range query can also be realized.

SYSTEM ARCHITECTURE:

EPLQ: Efficient Privacy-Preserving Location-Based Query Over Outsourced Encrypted Data

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Lichun Li, Rongxing Lu, Senior Member, IEEE, and Cheng Huang, “EPLQ: Efficient Privacy-Preserving Location-Based Query Over Outsourced Encrypted Data”, IEEE INTERNET OF THINGS JOURNAL, VOL. 3, NO. 2, APRIL 2016.

Securing SIFT: Privacy-preserving Outsourcing Computation of Feature Extractions over Encrypted Image Data

ABSTRACT:

Advances in cloud computing have greatly motivated data owners to outsource their huge amount of personal multimedia data and/or computationally expensive tasks onto the cloud by leveraging its abundant resources for cost saving and flexibility. Despite the tremendous benefits, the outsourced multimedia data and its originated applications may reveal the data owner’s private information, such as the personal identity, locations or even financial profiles. This observation has recently aroused new research interest on privacy-preserving computations over outsourced multimedia data. In this paper, we propose an effective and practical privacy-preserving computation outsourcing protocol for the prevailing scale-invariant feature transform (SIFT) over massive encrypted image data. We first show that previous solutions to this problem have either efficiency/security or practicality issues, and none can well preserve the important characteristics of the original SIFT in terms of distinctiveness and robustness. We then present a new scheme design that achieves efficiency and security requirements simultaneously with the preservation of its key characteristics, by randomly splitting the original image data, designing two novel efficient protocols for secure multiplication and comparison, and carefully distributing the feature extraction computations onto two independent cloud servers. We both carefully analyze and extensively evaluate the security and effectiveness of our design. The results show that our solution is practically secure, outperforms the state-of-the-art, and performs comparably to the original SIFT in terms of various characteristics, including rotation invariance, image scale invariance, robust matching across affine distortion, addition of noise and change in 3D viewpoint and illumination.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • In the existing literature, efforts on privacy-preserving outsourcing computation have been devoted to various mathematical problems including modular exponentiation, linear equations and kNN search.
  • These works mainly focus on engineering computation problems over numerical data or text data. Only in recent years, privacy-preserving data search in the ciphertext domain has been extended to content-based multimedia retrieval, face recognition and fingerprint identification.
  • Existing works have explored how to enable secure image search in the data outsourcing environment. Nevertheless, they all assume that the images have been pre-processed by some feature extraction algorithms to obtain their vector representations. Due to the importance of image feature extraction in multimedia data processing and its heavy operations on massive data, especially satellite data with its tremendous size and large number of feature points, the extraction or detection of image features in the ciphertext domain has begun to attract more and more research interest.
  • To the best of our knowledge, Hsu et al. were the first to investigate privacy-preserving SIFT in the encrypted domain by utilizing the Paillier homomorphic encryption.
  • Qin et al. presented an improved scheme with the aid of order preserving encryption and random permutation.
  • Wang et al. considered the problem of secure and private outsourcing of shape-based feature extraction and proposed two approaches with different levels of security by using homomorphic encryption and the garbled circuit protocol, respectively.

DISADVANTAGES OF EXISTING SYSTEM:

  • These solutions are either computationally intractable or insecure from the privacy perspective.
  • One common limitation of the previous solutions is that they all lack comprehensive analysis and evaluations with respect to the preservation of the key characteristics of the original image feature extraction algorithm.
  • They are still unable to eliminate edge responses, so the detected keypoints are unstable under small amounts of noise.

PROPOSED SYSTEM:

  • In this paper, we propose an effective and practical privacy-preserving computation outsourcing protocol for the prevailing scale-invariant feature transform (SIFT) over massive encrypted image data.
  • In our design, we first additively and randomly split the image into two shares and distribute them to two semi-honest, non-colluding cloud servers (a minimal sketch of this splitting follows this list). Then we design two secure interactive protocols, a batched secure multiplication protocol (BSMP) and a batched secure comparison protocol (BSCP), based on somewhat homomorphic encryption (SHE) integrated with the latest batching technique, single-instruction multiple-data (SIMD).
  • The former one allows the two servers to securely compute the products of multiple pairs of their private inputs simultaneously, while the latter one enables them to compare multiple pairs of private inputs at one time with privacy preservation.
  • Based on the two interactive protocols, we then further develop a new approach to let the two servers collaboratively detect real locations of stable keypoints via the difference-of-Gaussian (DoG) scale space built on encrypted image, and main orientations through computing the orientation range instead of their specific values from the encrypted versions of orientation histogram.
  • Finally, by exploiting the additive property of the encrypted image, the data owner can recover the real feature descriptors from the encrypted descriptors generated by the two servers while preventing them from seeing any information about the original image.
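
A minimal sketch of the first step above, the additive random splitting of an 8-bit image into two shares (mod 256): each cloud server receives one share, and adding the shares mod 256 recovers the original pixel. The helper names are assumptions for illustration; the actual protocol additionally runs BSMP/BSCP over SHE ciphertexts on the shares.

    using System;

    // Additively split an 8-bit grayscale image into two random shares (mod 256).
    // Neither share alone reveals the image; share1 + share2 (mod 256) = image.
    static class ImageSplitSketch
    {
        private static readonly Random Rng = new Random();   // use a CSPRNG in a real deployment

        public static (byte[,] share1, byte[,] share2) Split(byte[,] image)
        {
            int h = image.GetLength(0), w = image.GetLength(1);
            var s1 = new byte[h, w];
            var s2 = new byte[h, w];
            for (int i = 0; i < h; i++)
                for (int j = 0; j < w; j++)
                {
                    s1[i, j] = (byte)Rng.Next(256);                       // uniformly random share
                    s2[i, j] = (byte)((image[i, j] - s1[i, j]) & 0xFF);   // complementary share mod 256
                }
            return (s1, s2);
        }

        public static byte RecoverPixel(byte a, byte b) => (byte)((a + b) & 0xFF);
    }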

ADVANTAGES OF PROPOSED SYSTEM:

  • We design two novel secure interactive protocols BSMP and BSCP that enable the two servers to compute the products and make comparisons of multiple pairs of integers simultaneously with privacy preservation by using somewhat homomorphic encryption (SHE) and the batching technique SIMD.
  • We for the first time propose a new and effective privacy preserving outsourcing protocol for SIFT with the preservation of its key characteristics, by randomly splitting the original image data, carefully distributing the feature extraction computations to two independent cloud servers based on BSMP and BSCP.
  • We carefully analyze our protocol to show that it preserves the important characteristics of the original SIFT, in terms of distinctiveness and robustness, much better than the existing solutions.
  • We also provide both a detailed security analysis and an extensive privacy evaluation to demonstrate the privacy-preserving guarantee of our design.
  • We evaluate our protocol comprehensively with real world massive image datasets. The results show that our solution is practically secure, outperforms the state-of-the-art, and performs comparably to the original SIFT in terms of various characteristics, including rotation invariance, image scale invariance, robust matching across affine distortion, change in illumination, and addition of noise.

SYSTEM ARCHITECTURE:

Securing SIFT: Privacy-preserving Outsourcing Computation of Feature Extractions over Encrypted Image Data

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Shengshan Hu, Qian Wang, Member, IEEE, Jingjun Wang, Zhan Qin, and Kui Ren, Fellow, IEEE, “Securing SIFT: Privacy-preserving Outsourcing Computation of Feature Extractions over Encrypted Image Data” IEEE Transactions on Image Processing, 2016.

Secure Reversible Image Data Hiding over Encrypted Domain via Key Modulation

ABSTRACT:

This work proposes a novel reversible image data hiding (RIDH) scheme over encrypted domain. The data embedding is achieved through a public key modulation mechanism, in which access to the secret encryption key is not needed. At the decoder side, a powerful two-class SVM classifier is designed to distinguish encrypted and non-encrypted image patches, allowing us to jointly decode the embedded message and the original image signal. Compared with the state of the art, the proposed approach provides higher embedding capacity, and is able to perfectly reconstruct the original image as well as the embedded message. Extensive experimental results are provided to validate the superior performance of our scheme.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • The majority of the existing RIDH algorithms are designed over the plaintext domain, namely, the message bits are embedded into the original, un-encrypted images.
  • The early works mainly utilized the lossless compression algorithm to compress certain image features, in order to vacate room for message embedding.
  • Histogram shifting (HS)-based technique, initially designed by Ni et al., is another class of approach achieving better embedding performance through shifting the histogram of some image features. The latest difference expansion (DE)-based schemes and the improved prediction error expansion (PEE)-based strategies were shown to be able to offer the state-of-the-art capacity distortion performance.

DISADVANTAGES OF EXISTING SYSTEM:

  • As the source coding with side information at the decoder requires a feedback channel, this scheme would face severe challenges in many practical scenarios, e.g., secure remote sensing, where the feedback channel could be very costly.
  • The embedding capacity of this type of method is rather limited and the incurred distortion on the watermarked image is severe.

PROPOSED SYSTEM:

  • In this work, we propose an encrypted-domain RIDH scheme by specifically taking the above-mentioned design preferences into consideration. The proposed technique embeds the message through a public key modulation mechanism (a minimal sketch of this idea follows this list), and performs data extraction by exploiting the statistical distinguishability of encrypted and non-encrypted image blocks.
  • Since the decoding of the message bits and the original image is tied together, our proposed technique belongs to the category of non-separable RIDH solutions.
  • Compared with the state of the art, the proposed approach provides higher embedding capacity, and is able to achieve perfect reconstruction of the original image as well as the embedded message bits.
  • Extensive experimental results on test images validate the superior performance of our scheme.
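
A minimal sketch of the public key modulation idea mentioned above, under the simplifying assumptions that the image block is stream-cipher encrypted and that embedding XORs it with one of 2^n public keys selected by n message bits. The method names are hypothetical; the actual scheme's key construction and its SVM-based joint decoding of message and image are not shown.

    using System;
    using System.Linq;

    // Embed n message bits into an encrypted block by XOR-ing it with the public
    // key indexed by those bits; no secret encryption key is needed to embed.
    // Extraction would try every candidate key and let a classifier decide which
    // result decrypts to a natural-looking patch.
    static class KeyModulationSketch
    {
        public static byte[] Embed(byte[] encryptedBlock, byte[][] publicKeys, int messageIndex)
        {
            byte[] key = publicKeys[messageIndex];   // messageIndex encodes log2(publicKeys.Length) bits
            return encryptedBlock.Select((b, i) => (byte)(b ^ key[i])).ToArray();
        }

        public static byte[] XorWithCandidateKey(byte[] markedBlock, byte[] candidateKey)
            => markedBlock.Select((b, i) => (byte)(b ^ candidateKey[i])).ToArray();
    }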

ADVANTAGES OF PROPOSED SYSTEM:

  • Enables us to jointly decode the embedded message and the original image signal perfectly.
  • We also have performed extensive experiments to validate the superior embedding performance of our proposed RIDH method over encrypted domain.

SYSTEM ARCHITECTURE:

Secure Reversible Image Data Hiding over Encrypted Domain via Key Modulation

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Jiantao Zhou, Member, IEEE, Weiwei Sun, Student Member, IEEE, Li Dong, Student Member, IEEE, Xianming Liu, Member, IEEE, Oscar C. Au, Fellow, IEEE, and Yuan Yan Tang, Fellow, IEEE, “Secure Reversible Image Data Hiding over Encrypted Domain via Key Modulation”, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 2016.

Content-Adaptive Steganography by Minimizing Statistical Detectability

ABSTRACT:

Most current steganographic schemes embed the secret payload by minimizing a heuristically defined distortion. Similarly, their security is evaluated empirically using classifiers equipped with rich image models. In this paper, we pursue an alternative approach based on a locally estimated multivariate Gaussian cover image model that is sufficiently simple to derive a closed-form expression for the power of the most powerful detector of content-adaptive least significant bit matching but, at the same time, complex enough to capture the non-stationary character of natural images. We show that when the cover model estimator is properly chosen, the state-of-the-art performance can be obtained. The closed-form expression for detectability within the chosen model is used to obtain new fundamental insight regarding the performance limits of empirical steganalysis detectors built as classifiers. In particular, we consider a novel detectability limited sender and estimate the secure payload of individual images.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • The authors proposed to use the Kullback–Leibler divergence between the statistical distributions of cover and stego images when modeling the cover pixels as a sequence of independent Gaussian random variables with unequal variances (multivariate Gaussian or MVG).
  • Using a rather simple pixel variance estimator, the authors showed that the empirical security of their embedding method was roughly comparable to HUGO but subpar with respect to state-of-the-art steganographic methods.
  • The existing approach was later extended by utilizing a better variance estimator and replacing the Gaussian model with the generalized Gaussian. The authors focused on whether it is possible to further improve the security by allowing a pentary embedding operation with a thicker-tail model.

DISADVANTAGES OF EXISTING SYSTEM:

  • The design of distortion functions that measure statistical detectability rather than distortion was identified as one of the most important open problems in the recent motivational review article.
  • These approaches are limited to empirical “models” that need to be learned from a database of images. Such embedding schemes may become “overoptimized” to the feature space and cover source, and become highly detectable should the Warden choose a different feature representation.

PROPOSED SYSTEM:

  • For the first time empirical detectors can be compared with optimal detectors and evaluated w.r.t. the performance bound valid within the chosen cover model.
  • In particular, when forcing the heteroscedastic model of sensor acquisition noise to an artificial image with simple content, we observed that the difference in performance between the optimal likelihood-ratio detector and empirical detectors built as classifiers using rich media models is rather small. This indicates that in this source, current empirical steganalysis is near optimal.
  • We introduce a novel type of the so-called “detectability limited sender” that adjusts the payload size for each image to not exceed a prescribed level of statistical detectability within the chosen model. On a database of real images, we contrast the theoretical security of this detectability-limited sender dictated by the model with the one obtained empirically using classifiers employing rich models. Despite the fact that the empirical detector can capture more complex dependencies between pixels than our MVG model, its detection power is much smaller. We attribute this suboptimality primarily to the difficulty of empirical detectors to deal with content heterogeneity of real images.
  • The availability of a closed-form expression for the power of the optimal detector allows us to compute the size of the secure payload for a given image and a chosen detectability (risk) level. We compare it with the secure payload size estimated using empirical detectors and draw several interesting and important facts about the interplay between theoretical and empirical detectors.

ADVANTAGES OF PROPOSED SYSTEM:

  • Framework introduced in this paper allows us to consider various types of Warden, which was not possible within the prior art.
  • We model the image noise residual as a sequence of independent quantized Gaussian variables with varying variances.
  • This allows us to achieve novel insights into both steganography design and steganalysis; a standard divergence expression underlying such detectability computations is sketched below.
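
For reference, a standard building block behind this kind of closed-form detectability is the Kullback-Leibler divergence between two sequences of independent zero-mean Gaussian residuals, with cover variances sigma_n^2 and stego variances sigma-tilde_n^2. The expression below is the textbook identity for that model, not the paper's exact closed form:

    % KL divergence between independent zero-mean Gaussian cover and stego residuals
    % (a standard identity; the paper's detectability expression builds on this model).
    D_{\mathrm{KL}}\big(P_{\mathrm{cover}} \,\|\, P_{\mathrm{stego}}\big)
      = \sum_{n} \frac{1}{2}\left(
          \frac{\sigma_n^2}{\tilde{\sigma}_n^2} - 1
          + \ln \frac{\tilde{\sigma}_n^2}{\sigma_n^2}
        \right)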

SYSTEM ARCHITECTURE:

Content-Adaptive Steganography by Minimizing Statistical Detectability

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Vahid Sedighi, Member, IEEE, Rémi Cogranne, Member, IEEE, and Jessica Fridrich, Senior Member, IEEE, “Content-Adaptive Steganography by Minimizing Statistical Detectability”, IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 11, NO. 2, FEBRUARY 2016.

Privacy-Preserving Outsourced Association Rule Mining on Vertically Partitioned Databases

ABSTRACT:

Association rule mining and frequent itemset mining are two popular and widely studied data analysis techniques for a range of applications. In this paper, we focus on privacy preserving mining on vertically partitioned databases. In such a scenario, data owners wish to learn the association rules or frequent itemsets from a collective dataset, and disclose as little information about their (sensitive) raw data as possible to other data owners and third parties. To ensure data privacy, we design an efficient homomorphic encryption scheme and a secure comparison scheme. We then propose a cloud-aided frequent itemset mining solution, which is used to build an association rule mining solution. Our solutions are designed for outsourced databases that allow multiple data owners to efficiently share their data securely without compromising on data privacy. Our solutions leak less information about the raw data than most existing solutions. In comparison to the only known solution achieving a similar privacy level as our proposed solutions, the performance of our proposed solutions is 3 to 5 orders of magnitude higher. Based on our experiment findings using different parameters and datasets, we demonstrate that the run time in each of our solutions is only one order of magnitude higher than that in the best non-privacy-preserving data mining algorithms. Since both data and computing work are outsourced to the cloud servers, the resource consumption at the data owner end is very low.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • Classic frequent itemset mining and association rule mining algorithms, such as Apriori, Eclat and FP-growth, were designed for a centralized database setting where the raw data is stored in the central site for mining (a toy illustration of Apriori-style support counting follows this list).
  • Privacy concerns were not considered in this setting. Vaidya and Clifton and Kantarcioglu and Clifton are the first to identify and address privacy issues in horizontally / vertically partitioned databases.
  • Due to an increased understanding of the importance of data privacy (e.g. in the aftermath of the revelations by Edward Snowden, a former NSA contractor), a number of privacy-preserving mining solutions have been proposed in recent times. In their settings, there are multiple data owners wishing to learn association rules or frequent itemsets from their joint data. However, the data owners are not willing to send their raw data to a central site due to privacy concerns. If each data owner has one or more rows (i.e. transactions) in the joint database, we say that the database is horizontally partitioned. If each data owner has one or more columns in the joint database, the database is considered vertically partitioned.
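
As a point of reference for the first bullet, here is a toy illustration of the support counting at the heart of classic Apriori, run on a plaintext, centralized database, which is exactly the non-private baseline the proposed outsourced solutions move away from. Class and method names are assumptions for this example.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Toy Apriori-style support counting: frequent single items first, then
    // candidate pairs built only from frequent items (the Apriori property).
    static class AprioriSketch
    {
        public static void Mine(List<HashSet<string>> transactions, int minSupport)
        {
            // Frequent 1-itemsets.
            var frequentItems = transactions.SelectMany(t => t)
                                            .GroupBy(item => item)
                                            .Where(g => g.Count() >= minSupport)
                                            .Select(g => g.Key)
                                            .OrderBy(item => item)
                                            .ToList();

            // Count candidate 2-itemsets against the transactions.
            for (int a = 0; a < frequentItems.Count; a++)
                for (int b = a + 1; b < frequentItems.Count; b++)
                {
                    int support = transactions.Count(
                        t => t.Contains(frequentItems[a]) && t.Contains(frequentItems[b]));
                    if (support >= minSupport)
                        Console.WriteLine($"{{{frequentItems[a]}, {frequentItems[b]}}}: support {support}");
                }
        }
    }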

DISADVANTAGES OF EXISTING SYSTEM:

  • No privacy protection for the raw data.
  • Low efficiency.
  • Significantly higher computational complexity.

PROPOSED SYSTEM:

  • This paper focuses on vertically partitioned databases, such databases are useful for market basket analysis. For example, different businesses, such as a fashion designer and a luxury watch designer, sell different products to the same community. These businesses collaborate to mine customer buying patterns from the joint database.
  • A transaction of the database contains the products that a customer had bought from one or more of the participating businesses, and attributes such as the customer credit card number and date of purchase are used as TIDs. Therefore, each of the businesses (i.e. data owners) will own some transaction partitions in the joint database. However, these businesses may not wish to disclose such data, which include trade secrets (e.g. there may be other competing businesses sharing the same joint database) and customer privacy (e.g. due to regulations in the existing privacy regime). Therefore, a privacy-preserving mining solution must be applied.
  • In this paper, we propose a cloud-aided privacy-preserving frequent itemset mining solution for vertically partitioned databases, which is then used to build a privacy-preserving association rule mining solution. Both solutions are designed for applications where data owners have a high level of privacy requirement. The solutions are also suitable for data owners looking to outsource data storage, i.e. data owners can outsource their encrypted data and mining task to a semi-trusted (i.e. curious but honest) cloud in a privacy-preserving manner.

ADVANTAGES OF PROPOSED SYSTEM:

  • To the best of our knowledge, this is the first work on outsourced association rule mining and frequent itemset mining for vertically partitioned databases. The key underlying techniques in our solutions are an efficient homomorphic encryption scheme and a secure outsourced comparison scheme.
  • This paper proposes privacy-preserving mining solutions for high privacy requirements.
  • This paper proposes an efficient homomorphic encryption scheme and a secure outsourced comparison scheme.
  • This paper proposes a ciphertext tag approach for canceling out fictitious data’s effect on mining result.

SYSTEM ARCHITECTURE:

Privacy-Preserving Outsourced Association Rule Mining

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Lichun Li, Rongxing Lu, Senior Member, IEEE, Kim-Kwang Raymond Choo, Senior Member, IEEE, Anwitaman Datta, and Jun Shao, “Privacy-Preserving Outsourced Association Rule Mining on Vertically Partitioned Databases”, IEEE Transactions on Information Forensics and Security, 2016.

ID2S Password-Authenticated Key Exchange Protocols

ABSTRACT:

In a two-server password-authenticated key exchange (PAKE) protocol, a client splits its password and stores two shares of its password in the two servers, respectively, and the two servers then cooperate to authenticate the client without knowing the password of the client. In case one server is compromised by an adversary, the password of the client is required to remain secure. In this paper, we present two compilers that transform any two-party PAKE protocol to a two-server PAKE protocol on the basis of the identity-based cryptography, called ID2S PAKE protocol. By the compilers, we can construct ID2S PAKE protocols which achieve implicit authentication. As long as the underlying two-party PAKE protocol and identity-based encryption or signature scheme have provable security without random oracles, the ID2S PAKE protocols constructed by the compilers can be proven to be secure without random oracles. Compared with the Katz et al.’s two-server PAKE protocol with provable security without random oracles, our ID2S PAKE protocol can save from 22% to 66% of computation in each server.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • In the single-server setting, all the passwords necessary to authenticate clients are stored in a single server. If the server is compromised, due to, for example, hacking or even insider attacks, passwords stored in the server are all disclosed. This is also true of Kerberos, where a user authenticates against the authentication server with his username and password and obtains a token to authenticate against the service server.
  • PAKE protocols in the single-server setting can be classified into three categories: password-only PAKE, PKI-based PAKE, and ID-based PAKE.

DISADVANTAGES OF EXISTING SYSTEM:

  • In PAKE, two parties, based only on their knowledge of a password, establish a cryptographic key by exchanging messages.
  • A PAKE protocol has to be immune to on-line and off-line dictionary attacks. In an off-line dictionary attack, an adversary exhaustively tries all possible passwords in a dictionary in order to determine the password of the client on the basis of the exchanged messages.
  • In an on-line dictionary attack, an adversary simply attempts to log in repeatedly, trying each possible password. By cryptographic means only, none of the PAKE protocols can prevent on-line dictionary attacks. But on-line attacks can be stopped simply by setting a threshold on the number of login failures.

PROPOSED SYSTEM:

  • In this paper, we propose a new compiler for ID2S PAKE protocol based on any identity-based signature scheme (IBS), such as the Paterson et al.’s scheme.
  • The basic idea is: the client splits its password into two shares (a minimal sketch of such splitting follows this list), and each server keeps one share of the password in addition to a private key related to its identity for signing.
  • In key exchange, each server sends the client its public key for encryption with its identity-based signature on it. The signature can be verified by the client on the basis of the identity of the server.
  • If the signature is genuine, the client submits to the server one share of the password encrypted with the public key of the server. With the decryption keys, both servers can derive the same one-time password, by which the two servers can run a two-party PAKE protocol to authenticate the client.
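
A minimal sketch of the password splitting mentioned above, under the simplifying assumption that the two shares are an XOR sharing of a hash of the password: server 1 stores a uniformly random share, server 2 stores the complementary share, and neither learns the password alone. The helper is hypothetical; the actual ID2S protocols derive and use the shares inside identity-based encryption/signature operations.

    using System;
    using System.Security.Cryptography;
    using System.Text;

    // Split H(password) into two XOR shares so that no single server learns it.
    static class PasswordSplitSketch
    {
        public static (byte[] share1, byte[] share2) Split(string password)
        {
            byte[] digest;
            using (var sha = SHA256.Create())
                digest = sha.ComputeHash(Encoding.UTF8.GetBytes(password));

            var share1 = new byte[digest.Length];
            using (var rng = RandomNumberGenerator.Create())
                rng.GetBytes(share1);                               // uniformly random share for server 1

            var share2 = new byte[digest.Length];
            for (int i = 0; i < digest.Length; i++)
                share2[i] = (byte)(digest[i] ^ share1[i]);          // share1 XOR share2 = H(password)

            return (share1, share2);
        }
    }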

ADVANTAGES OF PROPOSED SYSTEM:

  • We have implemented our ID2S PAKE protocols; the results show that our protocols save from 22% to 66% of computation in each server, compared with the Katz et al.’s protocol.
  • The server performance is critical to the performance of the whole protocol when the servers provide services to a great number of clients concurrently.
  • Our Protocol shows that less than one second is needed for the client to execute our protocols.
  • In the real world, a protocol determines how users behave in response to input from their environments. In the formal model, these inputs are provided by the adversary. Each user is assumed to be able to execute the protocol multiple times (possibly concurrently) with different partners.
  • This is modeled by allowing each user to have unlimited number of instances with which to execute the protocol.

SYSTEM ARCHITECTURE:

ID2S Password-Authenticated Key Exchange Protocols

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Xun Yi, Fang-Yu Rao, Zahir Tari, Feng Hao, Elisa Bertino, Ibrahim Khalil and Albert Y. Zomaya, “ID2S Password-Authenticated Key Exchange Protocols”, IEEE Transactions on Computers, 2016.

Detecting and Removing Web Application Vulnerabilities with Static Analysis and Data Mining

ABSTRACT:

Although a large research effort on web application security has been going on for more than a decade, the security of web applications continues to be a challenging problem. An important part of that problem derives from vulnerable source code, often written in unsafe languages like PHP. Source code static analysis tools are a solution to find vulnerabilities, but they tend to generate false positives, and require considerable effort for programmers to manually fix the code. We explore the use of a combination of methods to discover vulnerabilities in source code with fewer false positives. We combine taint analysis, which finds candidate vulnerabilities, with data mining, to predict the existence of false positives. This approach brings together two approaches that are apparently orthogonal: humans coding the knowledge about vulnerabilities (for taint analysis), joined with the seemingly orthogonal approach of automatically obtaining that knowledge (with machine learning, for data mining). Given this enhanced form of detection, we propose doing automatic code correction by inserting fixes in the source code. Our approach was implemented in the WAP tool, and an experimental evaluation was performed with a large set of PHP applications. Our tool found 388 vulnerabilities in 1.4 million lines of code. Its accuracy and precision were approximately 5% better than PhpMinerII’s and 45% better than Pixy’s.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • There is a large corpus of related work, so we just summarize the main areas by discussing representative papers, while leaving many others unreferenced to conserve space.
  • Static analysis tools automate the auditing of code, either source, binary, or intermediate.
  • Taint analysis tools like CQUAL and Splint (both for C code) use two qualifiers to annotate source code: the untainted qualifier indicates either that a function or parameter returns trustworthy data (e.g., a sanitization function), or that a parameter of a function requires trustworthy data (e.g., mysql_query). The tainted qualifier means that a function or a parameter returns non-trustworthy data (e.g., functions that read user input). A toy illustration of this kind of taint propagation follows this list.
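
A toy illustration of the taint propagation idea just described (WAP itself analyses PHP source with a far richer analysis plus data mining). The variable-level model below, and all names in it, are assumptions for this example, written in the C#.NET setting used by the projects.

    using System;
    using System.Collections.Generic;

    // Toy taint tracker: variables assigned from user input are tainted, taint
    // flows through concatenation, a sanitizer clears it, and a tainted value
    // reaching a sensitive sink is flagged as a candidate vulnerability.
    sealed class TaintSketch
    {
        private readonly HashSet<string> _tainted = new HashSet<string>();

        public void AssignFromUserInput(string variable) => _tainted.Add(variable);

        public void AssignConcat(string target, params string[] sources)
        {
            bool anyTainted = false;
            foreach (var s in sources) anyTainted |= _tainted.Contains(s);
            if (anyTainted) _tainted.Add(target); else _tainted.Remove(target);
        }

        public void Sanitize(string variable) => _tainted.Remove(variable);

        public void Sink(string variable, string sinkName)
        {
            if (_tainted.Contains(variable))
                Console.WriteLine($"Candidate vulnerability: tainted '{variable}' reaches {sinkName}");
        }
    }

For example, AssignFromUserInput("id") followed by AssignConcat("query", "prefix", "id") and Sink("query", "mysql_query") reports a candidate SQL injection, while calling Sanitize("id") before the concatenation suppresses it.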

DISADVANTAGES OF EXISTING SYSTEM:

  • These other works did not aim to detect bugs and identify their location, but to assess the quality of the software in terms of the prevalence of defects and vulnerabilities.
  • WAP does not use data mining to identify vulnerabilities, but to predict whether the vulnerabilities found by taint analysis are really vulnerabilities or false positives.
  • AMNESIA does static analysis to discover all SQL queries, vulnerable or not; and in runtime it checks if the call being made satisfies the format defined by the programmer.
  • WebSSARI also does static analysis, and inserts runtime guards, but no details are available about what the guards are, or how they are inserted.

PROPOSED SYSTEM:

  • This paper explores an approach for automatically protecting web applications while keeping the programmer in the loop. The approach consists in analyzing the web application source code searching for input validation vulnerabilities, and inserting fixes in the same code to correct these flaws. The programmer is kept in the loop by being allowed to understand where the vulnerabilities were found, and how they were corrected.
  • This approach contributes directly to the security of web applications by removing vulnerabilities, and indirectly by letting the programmers learn from their mistakes. This last aspect is enabled by inserting fixes that follow common security coding practices, so programmers can learn these practices by seeing the vulnerabilities, and how they were removed.
  • We explore the use of a novel combination of methods to detect this type of vulnerability: static analysis with data mining. Static analysis is an effective mechanism to find vulnerabilities in source code, but tends to report many false positives (non-vulnerabilities) due to its undecidability
  • To predict the existence of false positives, we introduce the novel idea of assessing if the vulnerabilities detected are false positives using data mining. To do this assessment, we measure attributes of the code that we observed to be associated with the presence of false positives, and use a combination of the three top-ranking classifiers to flag every vulnerability as false positive or not.

ADVANTAGES OF PROPOSED SYSTEM:

  • Ensuring that the code correction is done correctly requires assessing that the vulnerabilities are removed, and that the correct behavior of the application is not modified by the fixes.
  • We propose using program mutation and regression testing to confirm, respectively, that the fixes function as they are programmed to (blocking malicious inputs), and that the application remains working as expected (with benign inputs).
  • The main contributions of the paper are: 1) an approach for improving the security of web applications by combining detection and automatic correction of vulnerabilities in web applications; 2) a combination of taint analysis and data mining techniques to identify vulnerabilities with low false positives; 3) a tool that implements that approach for web applications written in PHP with several database management systems; and 4) a study of the configuration of the data mining component, and an experimental evaluation of the tool with a considerable number of open source PHP applications.

SYSTEM ARCHITECTURE:

Detecting and Removing Web Application Vulnerabilities

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Ibéria Medeiros, Nuno Neves, Member, IEEE, and Miguel Correia, Senior Member, IEEE, “Detecting and Removing Web Application Vulnerabilities with Static Analysis and Data Mining”, IEEE TRANSACTIONS ON RELIABILITY, VOL. 65, NO. 1, MARCH 2016.

Real-Time Semantic Search Using Approximate Methodology for Large-Scale Storage Systems

ABSTRACT:

The challenges of handling the explosive growth in data volume and complexity cause the increasing needs for semantic queries. The semantic queries can be interpreted as the correlation-aware retrieval, while containing approximate results. Existing cloud storage systems mainly fail to offer an adequate capability for the semantic queries. Since the true value or worth of data heavily depends on how efficiently semantic search can be carried out on the data in (near-) real-time, large fractions of data end up with their values being lost or significantly reduced due to the data staleness. To address this problem, we propose a near-real-time and cost-effective semantic queries based methodology, called FAST. The idea behind FAST is to explore and exploit the semantic correlation within and among datasets via correlation-aware hashing and manageable flat-structured addressing to significantly reduce the processing latency, while incurring acceptably small loss of data-search accuracy. The near-real-time property of FAST enables rapid identification of correlated files and the significant narrowing of the scope of data to be processed. FAST supports several types of data analytics, which can be implemented in existing searchable storage systems. We conduct a real-world use case in which children reported missing in an extremely crowded environment (e.g., a highly popular scenic spot on a peak tourist day) are identified in a timely fashion by analyzing 60 million images using FAST. FAST is further improved by using semantic-aware namespace to provide dynamic and adaptive namespace management for ultra-large storage systems. Extensive experimental results demonstrate the efficiency and efficacy of FAST in the performance improvements.

PROJECT OUTPUT VIDEO:

EXISTING SYSTEM:

  • ISABELAQA is a parallel query processing engine that is designed and optimized for analyzing and processing spatiotemporal, multivariate scientific data. MixApart uses an integrated data caching and scheduling solution to allow MapReduce computations to analyze data stored on enterprise storage systems.
  • The frontend caching layer enables the local storage performance required by data analytics. The shared storage back-end simplifies data management.
  • Spyglass exploits the locality of file namespace and skewed distribution of metadata to map the namespace hierarchy into a multi-dimensional K-D tree and uses multilevel versioning and partitioning to maintain consistency.
  • Glance, a just-in-time sampling-based system, can provide accurate answers for aggregate and top-k queries without prior knowledge.

DISADVANTAGES OF EXISTING SYSTEM:

  • Existing content-based analysis tools not only cause high complexity and costs, but also fail to effectively handle the massive amounts of files.
  • The high complexity routinely leads to very slow processing operations and very high and often unacceptable latency. Due to the unacceptable latency, the staleness of data severely diminishes the value of data.
  • Existing approaches to unstructured data search and analytics rely on system-based chunks of data files.
  • Due to the long latency incurred in data processing and the resulting data staleness, the value/worth of data becomes diminished and eventually nullified.

PROPOSED SYSTEM:

  • In the context of this paper, searchable data analytics are interpreted as obtaining data value/worth via queried results, such as finding a valuable record, a correlated process ID, an important image, a rebuild system log, etc.
  • We propose a novel near-real-time methodology for analyzing massive data, called FAST, with a design goal of efficiently processing such data in a real-time manner.
  • The key idea behind FAST is to explore and exploit the correlation property within and among datasets via improved correlation-aware hashing and flat-structured addressing to significantly reduce the processing latency of parallel queries, while incurring acceptably small loss of accuracy (a minimal hashing sketch follows this list).
  • The approximate scheme for real-time performance has been widely recognized in system design and high-end computing. In essence, FAST goes beyond the simple combination of existing techniques to offer efficient data analytics via significantly increased processing speed. Through the study of the FAST methodology, we aim to make the following contributions for near real-time data analytics.
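
As a minimal stand-in for the correlation-aware hashing mentioned above (the FAST construction itself is more elaborate), the sketch below computes MinHash-style signatures: files whose feature sets overlap heavily get similar short signatures, so correlated files can be grouped without comparing full contents. Names and the simple hash mixing are assumptions for this example.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // MinHash-style signatures for grouping correlated files by feature overlap.
    static class MinHashSketch
    {
        public static int[] Signature(IEnumerable<string> features, int numHashes = 16)
        {
            var sig = Enumerable.Repeat(int.MaxValue, numHashes).ToArray();
            foreach (var f in features)
                for (int k = 0; k < numHashes; k++)
                {
                    // Mix the feature hash with the hash-function index k (illustrative mixing only).
                    int h = unchecked((f.GetHashCode() ^ (k * 0x5bd1e995)) * 16777619);
                    h &= int.MaxValue;                  // keep it non-negative
                    if (h < sig[k]) sig[k] = h;
                }
            return sig;
        }

        // Fraction of matching slots approximates the Jaccard similarity of the feature sets.
        public static double EstimatedSimilarity(int[] a, int[] b)
            => a.Zip(b, (x, y) => x == y ? 1.0 : 0.0).Average();
    }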

ADVANTAGES OF PROPOSED SYSTEM:

  • Space-efficient summarization
  • Energy efficiency via hashing
  • Semantic-aware namespace
  • Real system implementation

SYSTEM ARCHITECTURE:

Real-Time Semantic Search Using Approximate Methodology

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15’’ LED
  • Input Devices : Keyboard, Mouse
  • Ram : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : .NET, C#.NET
  • Tool : Visual Studio 2008
  • Database : SQL SERVER 2005

REFERENCE:

Yu Hua, Senior Member, IEEE, Hong Jiang, Fellow, IEEE, and Dan Feng, Member, IEEE, “Real-Time Semantic Search Using Approximate Methodology for Large-Scale Storage Systems”, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 27, NO. 4, APRIL 2016.