Building Confidential and Efficient Query Services in the Cloud with RASP Data Perturbation

ABSTRACT:

With the wide deployment of public cloud computing infrastructures, using clouds to host data query services has become an appealing solution for its advantages in scalability and cost saving. However, some data may be so sensitive that the data owner does not want to move it to the cloud unless data confidentiality and query privacy are guaranteed. On the other hand, a secure query service should still provide efficient query processing and significantly reduce the in-house workload to fully realize the benefits of cloud computing. We propose the random space perturbation (RASP) data perturbation method to provide secure and efficient range query and kNN query services for protected data in the cloud. The RASP data perturbation method combines order preserving encryption (OPE), dimensionality expansion, random noise injection, and random projection to provide strong resilience to attacks on the perturbed data and queries. It also preserves multidimensional ranges, which allows existing indexing techniques to be applied to speed up range query processing. The kNN-R algorithm is designed to work with the RASP range query algorithm to process kNN queries. We have carefully analyzed the attacks on data and queries under a precisely defined threat model and realistic security assumptions. Extensive experiments have been conducted to show the advantages of this approach in efficiency and security.
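To make the perturbation concrete, the following is a minimal sketch of a RASP-style transformation: each d-dimensional record is OPE-transformed per dimension, extended with a constant and a fresh random noise component, and then multiplied by a secret invertible matrix. The OPE step is a placeholder and the matrix handling is simplified; this illustrates the idea, not the authors' exact construction.

```java
import java.security.SecureRandom;

/** Minimal sketch of RASP-style perturbation: y = A * (OPE(x), 1, v)^T.
 *  The OPE step is a placeholder; A is assumed to be a secret, invertible
 *  (d+2)x(d+2) matrix generated once by the data owner. */
public class RaspSketch {
    private final double[][] a;          // secret random matrix A
    private final SecureRandom rnd = new SecureRandom();
    private final int d;                 // original dimensionality

    public RaspSketch(int d, double[][] secretMatrix) {
        this.d = d;
        this.a = secretMatrix;           // must be (d+2) x (d+2) and invertible
    }

    /** Placeholder for an order-preserving encryption of one value. */
    private double ope(double v) { return v; } // hypothetical; a real OPE scheme goes here

    public double[] perturb(double[] x) {
        double[] ext = new double[d + 2];
        for (int i = 0; i < d; i++) ext[i] = ope(x[i]); // OPE each dimension
        ext[d] = 1.0;                    // appended constant used by range queries
        ext[d + 1] = rnd.nextDouble();   // fresh random noise per record
        double[] y = new double[d + 2];  // y = A * ext
        for (int i = 0; i < d + 2; i++)
            for (int j = 0; j < d + 2; j++)
                y[i] += a[i][j] * ext[j];
        return y;                        // stored in the cloud instead of x
    }
}
```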

EXISTING SYSTEM:

  • The requirements for constructing a practical query service in the cloud can be summarized as the CPEL criteria: data confidentiality, query privacy, efficient query processing, and low in-house processing cost. Satisfying all of these requirements dramatically increases the complexity of constructing query services in the cloud, and existing approaches address only some aspects of the problem.
  • The crypto-index and order preserving encryption (OPE) approaches are vulnerable to attacks, and the enhanced crypto-index approach puts a heavy burden on the in-house infrastructure to improve security and privacy.

DISADVANTAGES OF EXISTING SYSTEM:

  • Existing approaches do not satisfactorily address all aspects of the CPEL criteria.
  • They increase the complexity of constructing query services in the cloud.
  • They provide slow query services as the price of security and privacy assurance.

PROPOSED SYSTEM:

  • We propose the random space perturbation (RASP) data perturbation method to provide secure and efficient range query and kNN query services for protected data in the cloud.
  • The RASP data perturbation method combines order preserving encryption, dimensionality expansion, random noise injection, and random projection, to provide strong resilience to attacks on the perturbed data and queries.

ADVANTAGES OF PROPOSED SYSTEM:

  • The RASP perturbation is a unique combination of OPE, dimensionality expansion, random noise injection, and random projection, which provides a strong confidentiality guarantee.
  • The RASP approach preserves the topology of multidimensional ranges under the secure transformation, which allows indexing and efficient query processing.
  • The proposed service constructions are able to minimize the in-house processing workload because of the low perturbation cost and high-precision query results. This is an important feature enabling practical cloud-based solutions.
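The kNN-R algorithm mentioned in the abstract processes kNN queries on top of the range query service. As a hedged sketch of that idea (the real algorithm binary-searches over inner and outer square ranges; the rangeQuery service and the simple doubling strategy below are illustrative assumptions), a client can grow a range until it covers at least k candidates and then refine locally:

```java
import java.util.Comparator;
import java.util.List;

/** Sketch of a kNN-over-range-query strategy in the spirit of kNN-R.
 *  rangeQuery is a hypothetical call to the cloud-side RASP range service. */
public class KnnRSketch {
    interface RangeService { List<double[]> rangeQuery(double[] center, double radius); }

    /** Assumes the data set holds at least k points. */
    static List<double[]> knn(RangeService svc, double[] q, int k) {
        double radius = 1.0;
        List<double[]> candidates = svc.rangeQuery(q, radius);
        while (candidates.size() < k) {      // expand the square range until >= k hits
            radius *= 2;
            candidates = svc.rangeQuery(q, radius);
        }
        // client-side refinement: keep the k candidates nearest to q
        candidates.sort(Comparator.comparingDouble(p -> dist(p, q)));
        return candidates.subList(0, k);
    }

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }
}
```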

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

  • System                : Pentium IV, 2.4 GHz
  • Hard Disk             : 40 GB
  • Floppy Drive          : 1.44 MB
  • Monitor               : 15-inch VGA colour
  • Mouse                 : Logitech
  • RAM                   : 512 MB

SOFTWARE REQUIREMENTS:

  • Operating System      : Windows XP/7
  • Coding Language       : Java/J2EE
  • IDE                   : NetBeans 7.4
  • Database              : MySQL

REFERENCE:

Huiqi Xu, Shumin Guo, and Keke Chen, “Building Confidential and Efficient Query Services in the Cloud with RASP Data Perturbation”, IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 2, February 2014.

A Stochastic Model to Investigate Data Center Performance and QoS in IaaS Cloud Computing Systems

ABSTRACT:
Cloud data center management is a key problem due to the numerous and heterogeneous strategies that can be applied, ranging from VM placement to federation with other clouds. Performance evaluation of cloud computing infrastructures is required to predict and quantify the cost-benefit of a strategy portfolio and the corresponding quality of service (QoS) experienced by users. Such analyses are not feasible by simulation or on-the-field experimentation, due to the great number of parameters that have to be investigated. In this paper, we present an analytical model, based on stochastic reward nets (SRNs), that is both scalable enough to model systems composed of thousands of resources and flexible enough to represent different policies and cloud-specific strategies. Several performance metrics are defined and evaluated to analyze the behavior of a cloud data center: utilization, availability, waiting time, and responsiveness. A resiliency analysis is also provided to take load bursts into account. Finally, a general approach is presented that, starting from the concept of system capacity, can help system managers to appropriately set the data center parameters under different working conditions.

EXISTING SYSTEM:
  • In order to integrate business requirements and application-level needs in terms of quality of service (QoS), cloud service provisioning is regulated by service level agreements (SLAs): contracts between clients and providers that specify the price of a service, the QoS levels required during service provisioning, and the penalties associated with SLA violations. In such a context, performance evaluation plays a key role, allowing system managers to evaluate the effects of different resource management strategies on data center operation and to predict the corresponding costs and benefits.
  • Cloud systems differ from traditional distributed systems. First of all, they are characterized by a very large number of resources that can span different administrative domains. Moreover, the high level of resource abstraction makes it possible to implement particular resource management techniques, such as VM multiplexing or VM live migration, that, even if transparent to final users, have to be considered in the design of performance models in order to accurately capture system behavior. Finally, different clouds, belonging to the same or to different organizations, can dynamically join each other to achieve a common goal, usually the optimization of resource utilization. This mechanism, referred to as cloud federation, allows resources to be provided and released on demand, giving elastic capabilities to the whole infrastructure.

DISADVANTAGES OF EXISTING SYSTEM:
  • On-the-field experiments are mainly focused on the offered QoS; because they follow a black-box approach, it is difficult to correlate the observed data with the internal resource management strategies implemented by the system provider.
  • Simulation does not allow comprehensive analyses of system performance, due to the great number of parameters that have to be investigated.

PROPOSED SYSTEM:
  • In this paper, we present a stochastic model, based on stochastic reward nets (SRNs), that exhibits the above-mentioned features and captures the key concepts of an IaaS cloud system.
  • The proposed model is scalable enough to represent systems composed of thousands of resources, and it makes it possible to represent both physical and virtual resources while exploiting cloud-specific concepts such as infrastructure elasticity. With respect to the existing literature, the innovative aspect of the present work is that it offers a generic and comprehensive view of a cloud system.
  • Low-level details, such as VM multiplexing, are easily integrated with cloud-based actions such as federation, making it possible to investigate different mixed strategies. An exhaustive set of performance metrics is defined, covering both the system provider's perspective (e.g., utilization) and the final users' perspective (e.g., responsiveness).

ADVANTAGES OF PROPOSED SYSTEM:
  • To provide a fair comparison among different resource management strategies, also taking into account the system's elasticity, a performance evaluation approach is described.
  • This approach, based on the concept of system capacity, presents a holistic view of a cloud system and allows system managers to identify the best solution with respect to an established goal and to set the system parameters appropriately.

SYSTEM ARCHITECTURE:


MODULES:
1. System Queuing
2. Scheduling Module
3. VM Placement Module
4. Federation Module
5. Arrival Process

MODULES DESCRIPTION:
1. System Queuing:
Job requests (in terms of VM instantiation requests) are enqueued in the system queue. Such a queue has a finite size Q; once this limit is reached, further requests are rejected. The system queue is managed according to a FIFO scheduling policy.
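A minimal sketch of this admission policy, assuming a generic request type (class and method names are illustrative, not from the paper):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Finite FIFO system queue: requests beyond capacity Q are rejected. */
public class SystemQueue<T> {
    private final Deque<T> queue = new ArrayDeque<>();
    private final int capacity;           // Q in the model

    public SystemQueue(int capacity) { this.capacity = capacity; }

    /** Returns false (rejection) when the queue is full. */
    public boolean offer(T request) {
        if (queue.size() >= capacity) return false;
        queue.addLast(request);
        return true;
    }

    /** FIFO: the oldest waiting request is served first. */
    public T poll() { return queue.pollFirst(); }
}
```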
2. Scheduling Module:
When a resource is available, a job is accepted and the corresponding VM is instantiated. We assume that the instantiation time is negligible and that the service time (i.e., the time needed to execute a job) is exponentially distributed with mean 1/μ.
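Exponentially distributed service times with mean 1/μ can be sampled by inverse transform; a small sketch (the class name is illustrative):

```java
import java.util.Random;

/** Samples job service times ~ Exp(mu), i.e., mean 1/mu, via inverse transform. */
public class ServiceTime {
    private final Random rnd = new Random();
    private final double mu;              // service rate

    public ServiceTime(double mu) { this.mu = mu; }

    public double next() {
        // If U ~ Uniform(0,1), then -ln(U)/mu ~ Exp(mu)
        return -Math.log(1.0 - rnd.nextDouble()) / mu;
    }
}
```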

3. VM Placement:
According to the VM multiplexing technique, the cloud system can provide a number M of logical resources greater than N. In this case, multiple VMs can be allocated on the same physical machine (PM), e.g., a core in a multicore architecture. Multiple VMs sharing the same PM can incur a performance reduction, mainly due to I/O interference between VMs.
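One simple way to reflect this interference in a simulation is to let a VM's effective service rate shrink as more VMs share the same PM. The linear degradation factor below is our illustrative assumption, not the paper's SRN formulation:

```java
/** Toy model of multiplexing overhead: the effective service rate of a VM
 *  drops as more VMs share the same physical machine (PM). */
public class PlacementSketch {
    /** Degradation per extra co-located VM; illustrative value only. */
    private static final double INTERFERENCE = 0.1;

    /** Effective rate of one VM when 'vmsOnPm' VMs share the PM. */
    static double effectiveRate(double mu, int vmsOnPm) {
        return mu / (1.0 + INTERFERENCE * (vmsOnPm - 1));
    }
}
```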

4. Federation Module:
Cloud federation allows the system to use, in particular situations, the resources offered by other public cloud systems through a sharing-and-paying model. In this way, elastic capabilities can be exploited in order to respond to particular load conditions. Job requests can be redirected to other clouds by transferring the corresponding VM disk images through the network.
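A hedged sketch of the redirection decision: if the local cloud cannot accept the job, the VM disk image transfer delay is paid and the request is forwarded to a federated partner. All names and the delay abstraction are illustrative:

```java
/** Sketch of federation: overflow requests are redirected to a partner cloud. */
public class FederationSketch {
    interface Cloud { boolean accept(Runnable job); }

    /** Try the local cloud first; on overflow, pay the VM-image transfer
     *  delay and redirect to a federated partner. */
    static boolean submit(Cloud local, Cloud partner, Runnable job, double transferDelay) {
        if (local.accept(job)) return true;     // capacity available locally
        advanceClock(transferDelay);            // ship the VM disk image over the network
        return partner.accept(job);
    }

    static void advanceClock(double seconds) { /* simulation-clock bookkeeping */ }
}
```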
5. Arrival Process:
Finally, with respect to the arrival process, we investigate three different scenarios. In the first (constant arrival process), we assume the arrivals form a homogeneous Poisson process with rate λ. However, large-scale distributed systems with thousands of users, such as cloud systems, can exhibit self-similarity/long-range dependence in the arrival process, and the second scenario accounts for this. The last scenario (bursty arrival process) takes into account the presence of a burst with fixed, short duration, and it is used to investigate the system's resiliency.
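For the constant-arrival scenario, interarrival times of a homogeneous Poisson process with rate λ are exponentially distributed; a burst can be approximated by temporarily raising the rate. A sketch under these assumptions (the burst parameters are illustrative):

```java
import java.util.Random;

/** Interarrival-time generator for the constant (Poisson) and bursty scenarios.
 *  The burst shape (start, duration, amplified rate) is illustrative only. */
public class ArrivalProcess {
    private final Random rnd = new Random();

    /** Poisson process with rate lambda => Exp(lambda) interarrival times. */
    double poissonGap(double lambda) {
        return -Math.log(1.0 - rnd.nextDouble()) / lambda;
    }

    /** Bursty scenario: a short window during which the rate jumps to burstLambda. */
    double burstyGap(double t, double lambda, double burstStart,
                     double burstLen, double burstLambda) {
        boolean inBurst = t >= burstStart && t < burstStart + burstLen;
        return poissonGap(inBurst ? burstLambda : lambda);
    }
}
```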
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:

  • System                : Pentium IV, 2.4 GHz
  • Hard Disk             : 40 GB
  • Floppy Drive          : 1.44 MB
  • Monitor               : 15-inch VGA colour
  • Mouse                 : Logitech
  • RAM                   : 512 MB

SOFTWARE REQUIREMENTS:

  • Operating System      : Windows XP/7
  • Coding Language       : Java/J2EE
  • IDE                   : NetBeans 7.4
  • Database              : MySQL

REFERENCE:
Dario Bruneo, “A Stochastic Model to Investigate Data Center Performance and QoS in IaaS Cloud Computing Systems”, IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 3, March 2014.

A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud

ABSTRACT:
A large number of cloud services require users to share private data, such as electronic health records, for data analysis or mining, bringing privacy concerns. Anonymizing data sets via generalization to satisfy certain privacy requirements, such as k-anonymity, is a widely used category of privacy preserving techniques. At present, the scale of data in many cloud applications increases tremendously in accordance with the Big Data trend, thereby making it a challenge for commonly used software tools to capture, manage, and process such large-scale data within a tolerable elapsed time. As a result, it is a challenge for existing anonymization approaches to achieve privacy preservation on privacy-sensitive large-scale data sets due to their insufficient scalability. In this paper, we propose a scalable two-phase top-down specialization (TDS) approach to anonymize large-scale data sets using the MapReduce framework on cloud. In both phases of our approach, we deliberately design a group of innovative MapReduce jobs to concretely accomplish the specialization computation in a highly scalable way. Experimental evaluation results demonstrate that with our approach, the scalability and efficiency of TDS can be significantly improved over existing approaches.

EXISTING SYSTEM:
  • Addressing the scalability problem of the top-down specialization (TDS) approach for large-scale data anonymization calls for a widely adopted parallel data processing framework. The TDS approach, offering a good tradeoff between data utility and data consistency, is widely applied for data anonymization. Most TDS algorithms are centralized, resulting in their inadequacy in handling large-scale data sets. Although some distributed algorithms have been proposed, they mainly focus on secure anonymization of data sets from multiple parties rather than on scalability.

DISADVANTAGES OF EXISTING SYSTEM:
  • Within the MapReduce computation paradigm, it is still a challenge to design proper MapReduce jobs for TDS.

PROPOSED SYSTEM:
  • In this paper, we propose a scalable two-phase top-down specialization (TDS) approach to anonymize large-scale data sets using the MapReduce framework on cloud.
  • In both phases of our approach, we deliberately design a group of innovative MapReduce jobs to concretely accomplish the specialization computation in a highly scalable way.
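The paper's actual jobs are more involved, but as a hedged sketch of the flavor of a TDS-supporting MapReduce job, the Hadoop mapper/reducer below counts (attribute value, class label) pairs; such counts feed the information-gain scores used to choose the next specialization. The CSV layout and field positions are illustrative assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/** Counts (attribute value, class label) pairs across the data set; such
 *  counts feed the information-gain score used to pick specializations in
 *  TDS. The CSV record layout here is an assumption for illustration. */
public class SpecializationCount {
    public static class CountMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        @Override
        protected void map(LongWritable key, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",");
            String attrValue = fields[0];                 // hypothetical quasi-identifier field
            String classLabel = fields[fields.length - 1];
            ctx.write(new Text(attrValue + "\t" + classLabel), ONE);
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> vals, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : vals) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }
}
```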

ADVANTAGES OF PROPOSED SYSTEM:
  • Accomplishes the specializations in a highly scalable fashion.
  • Gains high scalability.
  • Significantly improves the scalability and efficiency of TDS for data anonymization over existing approaches.

SYSTEM ARCHITECTURE:

SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:

  • System                : Pentium IV, 2.4 GHz
  • Hard Disk             : 40 GB
  • Floppy Drive          : 1.44 MB
  • Monitor               : 15-inch VGA colour
  • Mouse                 : Logitech
  • RAM                   : 512 MB

SOFTWARE REQUIREMENTS:

  • Operating System      : Windows XP/7
  • Coding Language       : Java/J2EE
  • IDE                   : NetBeans 7.4
  • Database              : MySQL

REFERENCE:
Xuyun Zhang, Laurence T. Yang, Chang Liu, and Jinjun Chen, “A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud”, IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, February 2014.

A Hybrid Cloud Approach for Secure Authorized Deduplication

ABSTRACT:
Data deduplication is one of the most important data compression techniques for eliminating duplicate copies of repeating data, and it has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. To protect the confidentiality of sensitive data while supporting deduplication, the convergent encryption technique has been proposed to encrypt the data before outsourcing. To better protect data security, this paper makes the first attempt to formally address the problem of authorized data deduplication. Different from traditional deduplication systems, the differential privileges of users are further considered in the duplicate check, besides the data itself. We also present several new deduplication constructions supporting authorized duplicate check in a hybrid cloud architecture. Security analysis demonstrates that our scheme is secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement a prototype of our proposed authorized duplicate check scheme and conduct testbed experiments using our prototype. We show that our proposed authorized duplicate check scheme incurs minimal overhead compared to normal operations.
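Convergent encryption, mentioned above, derives the encryption key deterministically from the content itself, so identical plaintexts produce identical ciphertexts and can be deduplicated. A minimal sketch using standard Java cryptography APIs (the AES mode and the fixed IV are illustrative choices, not the paper's exact scheme):

```java
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Convergent encryption sketch: K = H(M), C = E_K(M), tag = H(C).
 *  Identical plaintexts yield identical ciphertexts, enabling dedup. */
public class ConvergentEncryption {
    public static byte[] encrypt(byte[] data) throws Exception {
        byte[] key = MessageDigest.getInstance("SHA-256").digest(data); // K = H(M)
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        // A fixed IV keeps encryption deterministic (required for dedup,
        // deliberately weaker than randomized encryption).
        cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(key, 0, 16, "AES"),
                new IvParameterSpec(new byte[16]));
        return cipher.doFinal(data);
    }

    /** Duplicate-check tag: hash of the ciphertext. */
    public static byte[] tag(byte[] ciphertext) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(ciphertext);
    }
}
```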

EXISTING SYSTEM:
  • In existing data deduplication systems, the private cloud is involved as a proxy to allow data owners/users to securely perform duplicate checks with differential privileges.
  • Such an architecture is practical and has attracted much attention from researchers.
  • The data owners only outsource their data storage to the public cloud, while the data operations are managed in the private cloud.

DISADVANTAGES OF EXISTING SYSTEM:
  • Traditional encryption, while providing data confidentiality, is incompatible with data deduplication.
  • Identical data copies of different users, each encrypted under a different key, will lead to different ciphertexts, making deduplication impossible.

PROPOSED SYSTEM:
In this paper, we enhance our system's security. Specifically, we present an advanced scheme to support stronger security by encrypting each file with differential privilege keys. In this way, users without the corresponding privileges cannot perform the duplicate check. Furthermore, such unauthorized users cannot decrypt the ciphertext even if they collude with the S-CSP (storage cloud service provider). Security analysis demonstrates that our system is secure in terms of the definitions specified in the proposed security model.
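One plausible way to realize the privilege-aware duplicate check described above (a sketch consistent with the description, not the paper's exact protocol) is to derive the check token from both the file digest and a per-privilege key, e.g., with HMAC, so that a user lacking the privilege key cannot produce a matching token:

```java
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Sketch of an authorized duplicate-check token: it binds the file digest
 *  to a privilege key, so only users holding that privilege can query it. */
public class DuplicateCheckToken {
    public static byte[] token(byte[] fileData, byte[] privilegeKey) throws Exception {
        byte[] fileDigest = MessageDigest.getInstance("SHA-256").digest(fileData);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(privilegeKey, "HmacSHA256"));
        return mac.doFinal(fileDigest);   // token = HMAC(k_privilege, H(M))
    }
}
```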
ADVANTAGES OF PROPOSED SYSTEM:
  • A user is only allowed to perform the duplicate check for files marked with the corresponding privileges.
  • We present an advanced scheme to support stronger security by encrypting the file with differential privilege keys.
  • The storage size of the tags used for integrity checks is reduced, enhancing the security of deduplication and protecting data confidentiality.

SYSTEM ARCHITECTURE:


SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:

  • System                : Pentium IV, 2.4 GHz
  • Hard Disk             : 40 GB
  • Floppy Drive          : 1.44 MB
  • Monitor               : 15-inch VGA colour
  • Mouse                 : Logitech
  • RAM                   : 512 MB

SOFTWARE REQUIREMENTS:

  • Operating System      : Windows XP/7
  • Coding Language       : Java/J2EE
  • IDE                   : NetBeans 7.4
  • Database              : MySQL

REFERENCE:
Jin Li, Yan Kit Li, Xiaofeng Chen, Patrick P. C. Lee, and Wenjing Lou, “A Hybrid Cloud Approach for Secure Authorized Deduplication”, IEEE Transactions on Parallel and Distributed Systems, 2014.