Frequent Itemsets Mining With Differential Privacy Over Large-Scale Data

ABSTRACT:

Frequent itemsets mining with differential privacy is the problem of mining all itemsets whose support exceeds a given threshold in a transactional dataset, under the constraint that the mined results must not compromise the privacy of any single transaction. Existing solutions struggle to balance efficiency, privacy, and data utility on large-scale data. To address this, we propose an efficient, differentially private frequent itemsets mining algorithm for large-scale data. By combining sampling with transaction truncation under a length constraint, our algorithm reduces both the computational cost and the sensitivity of mining, and thereby improves data utility under a fixed privacy budget. Experimental results show that our algorithm achieves better performance than prior approaches on multiple datasets.
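The ideas named in the abstract (sampling, truncating transactions to a maximum length, and adding noise calibrated to the resulting sensitivity) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the parameters (min_count, max_len, max_size, sample_rate), the choice to keep the first items in sorted order when truncating, and the use of the Laplace mechanism are all assumptions made for illustration. The sketch only perturbs itemsets that actually occur in the data, so it shows the mechanics rather than a complete privacy guarantee.

```python
import itertools
import math
import random
from collections import Counter


def truncate(transaction, max_len):
    """Keep at most max_len distinct items (assumption: first in sorted order)."""
    return tuple(sorted(set(transaction))[:max_len])


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_frequent_itemsets(transactions, min_count, epsilon,
                            max_len=3, max_size=2, sample_rate=1.0, seed=None):
    """Return {itemset: noisy_count} for itemsets whose noisy count >= min_count."""
    rng = random.Random(seed)

    # Step 1: sample transactions to reduce the amount of data to scan.
    sample = [t for t in transactions if rng.random() < sample_rate]

    # Step 2: truncate each transaction so it contributes to few itemsets.
    truncated = [truncate(t, max_len) for t in sample]

    # Step 3: count every itemset of size 1..max_size.
    counts = Counter()
    for t in truncated:
        for k in range(1, max_size + 1):
            counts.update(itertools.combinations(t, k))

    # A truncated transaction changes at most sum_k C(max_len, k) counts by 1,
    # which bounds the L1 sensitivity of the count vector.
    sensitivity = sum(math.comb(max_len, k) for k in range(1, max_size + 1))
    scale = sensitivity / epsilon

    # Step 4: add Laplace noise and keep itemsets above the threshold.
    result = {}
    for itemset, count in counts.items():
        noisy = count + laplace_noise(scale, rng)
        if noisy >= min_count:
            result[itemset] = noisy
    return result
```

A smaller max_len lowers the sensitivity and therefore the noise scale for a given epsilon, at the cost of discarding items from long transactions; this is the trade-off behind truncation that the abstract refers to.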

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS: 

  • System : Pentium Dual Core.
  • Hard Disk : 120 GB.
  • Monitor : 15" LED
  • Input Devices : Keyboard, Mouse
  • RAM : 1 GB

SOFTWARE REQUIREMENTS: 

  • Operating system : Windows 7.
  • Coding Language : Python
  • Database : MySQL

REFERENCE:

Xinyu Xiong, Fei Chen, Peizhi Huang, Miaomiao Tian, Xiaofang Hu, Badong Chen, and Jing Qin, “Frequent Itemsets Mining With Differential Privacy Over Large-Scale Data”, IEEE Access, 2018.
