[DML10] Towards an Automatic Detection of Sensitive Information in a Database
International peer-reviewed conference:
DBKDA'10, Int. Conf. on Advances in Databases, Knowledge, and Data Applications, Les Menuires,
January 2010,
pp.34-39,
Keywords:
Abstract:
The test phase is a crucial step in information system design.
It is the first real validation of user requirements. To
maximize their effectiveness, tests are often conducted on
real data. However, development and testing are increasingly
outsourced, leading companies to provide external staff with
real confidential data. A solution to this problem is known as
data scrambling. Many algorithms aim to replace real data
with false but realistic values. However, nothing has
been developed to automate the crucial task of detecting
the data to be scrambled. In this paper, we propose an
innovative approach, implemented as an expert
system, to automatically detect the candidate
attributes for scrambling. Our approach is mainly based on
semantic rules that determine which concepts have to be
scrambled, and on a linguistic component that retrieves the
attributes that semantically correspond to these concepts.
Since attributes cannot be considered independently of
each other, we also address the challenging problem of
propagating the scrambling across the whole database.
An important contribution of our approach is to provide
a semantic model of sensitive data. This knowledge is
made available through production rules that operationalize
sensitive data detection.
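
The abstract's two steps (matching attributes to sensitive concepts, then propagating the scrambling to related attributes) can be illustrated with a minimal Python sketch. It is not the paper's expert system: the concept lexicon, the token-based matching, and the notion of a "link" between attributes (for example a foreign key or a copied column) are all assumptions made for illustration, and every name below is hypothetical.

```python
import re

# Hypothetical lexicon (an assumption, not taken from the paper):
# sensitive concept -> attribute-name terms that may denote it.
SENSITIVE_CONCEPTS = {
    "person_name": {"name", "surname", "lastname", "firstname"},
    "salary": {"salary", "wage", "pay"},
    "identifier": {"ssn", "passport"},
}


def tokenize(attribute):
    """Split an attribute name such as 'last_name' into lowercase tokens."""
    return set(re.split(r"[^a-z0-9]+", attribute.lower())) - {""}


def detect(schema):
    """Return {(table, attribute): concept} for attributes matching a concept.

    A crude stand-in for the linguistic component: an attribute is a
    candidate when one of its name tokens appears in a concept's term set.
    """
    found = {}
    for table, attributes in schema.items():
        for attribute in attributes:
            tokens = tokenize(attribute)
            for concept, terms in SENSITIVE_CONCEPTS.items():
                if tokens & terms:
                    found[(table, attribute)] = concept
                    break
    return found


def propagate(candidates, links):
    """Extend candidates along undirected attribute links until a fixed point.

    'links' is a list of pairs of (table, attribute) keys that are assumed
    to hold the same values, so scrambling one requires scrambling the other.
    """
    result = dict(candidates)
    changed = True
    while changed:
        changed = False
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src in result and dst not in result:
                    result[dst] = result[src]
                    changed = True
    return result


schema = {
    "employee": ["id", "last_name", "salary"],
    "payroll": ["emp_id", "amount"],
    "audit": ["amount_copy"],
}
links = [
    (("payroll", "amount"), ("audit", "amount_copy")),
    (("employee", "salary"), ("payroll", "amount")),
]
candidates = detect(schema)
to_scramble = propagate(candidates, links)
```

In this toy schema, `detect` flags `employee.last_name` and `employee.salary` by name alone, while `payroll.amount` and `audit.amount_copy` are reached only through propagation, which is why attributes cannot be assessed in isolation.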