Title: Data De-identification
Source www.dcmsys.com
Automating Data De-identification
One aspect of medical technology innovation that is
massively relevant to DICOM Systems at the moment
is the concept of data de-identification.
De-identification is a crucial part of current
efforts to advance medical technology, as
billions of dollars flow to firms promising
huge leaps forward in AI for diagnosis
in various fields. Half of all surveyed chief
information officers in health enterprises are
planning to deploy artificial intelligence in
some form either this year or next year. It's a
field with incredible potential, restricted by
the essential need to protect patients' personal
information. Any way of automating the data
de-identification process will massively boost
productivity and give the whole field of medical
AI a real shot in the arm.
The Necessity of Mass Data De-identification
Humans are, in a number of ways, easier to teach
than artificial intelligence. That's because
humans have been wired and conditioned over
millions of years of evolution to recognize
patterns, extrapolate, and intuit from incomplete
data. AIs have a much shorter development time,
and need to be taught from the ground up what
conclusions they should draw from the data they
receive. Since AIs have not yet reached the level
of sophistication needed to learn the flashes of
human inspiration and intuition that serve many
medical professionals so well, we have to resort
to sheer brute-force rote learning. Any
AI that wants to learn even a minute amount about
medical imaging needs to be fed a vast amount of
data before it can be relied upon to make
accurate assessments of medical imaging files. At
a minimum, 100,000 samples are required. This
enormous need for clean data renders manual
attempts to de-identify medical files completely
impractical. That's where we come in.
DICOM Systems' Data De-identification System
DICOM Systems does not itself deal in artificial
intelligence algorithms, but we specialize in
piping in the volumes of data needed to form a data
lake from which an AI can learn. To convert
healthcare providers' imaging data into safe,
de-identified data that the AI handlers can use,
we have two scripts. The first works on the
metadata of the file, finding and stripping out
18 different kinds of identifying information,
including name, religion, and age. With this data
securely eliminated, there is no way a patient
can be identified from the metadata. The second
script goes to work on the image file itself,
neutralizing identifying data (for example, dates,
patient number, and hospital location). Two
options are available, depending on one's
preference: scrambling the identifying data in
the file, or masking it entirely. It's important
to note that, since the script goes to work on
enormous sets of files indiscriminately, the data
fed into it needs to be uniform. It can't detect
whether, for example, the set mixes data from two
hospitals with different notation policies, with
five thousand images carrying their notations at
the top and five thousand carrying them on the
right. It will merely scramble or mask the
section of the image it is told to, regardless of
whether the notations are there or not. Be sure
of the content of your image files before you set
this script to work. Luckily, this process comes
with its own QA stage, so as to ensure that any
human oversights can be corrected before final
dispatch.
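The two steps described above can be illustrated with a minimal Python sketch. It is not DICOM Systems' actual scripts. It assumes the metadata is a plain dictionary and the image is a 2D grid of pixel values, and the field names in IDENTIFYING_FIELDS and the function names strip_metadata and mask_region are hypothetical. Only the masking option is shown, not scrambling.

```python
import copy

# Hypothetical field names. The source lists only name, religion, and age
# among its 18 kinds of identifying information.
IDENTIFYING_FIELDS = {"PatientName", "PatientReligion", "PatientAge"}


def strip_metadata(metadata, fields=IDENTIFYING_FIELDS):
    """Return a copy of the metadata without the identifying fields."""
    return {key: value for key, value in metadata.items() if key not in fields}


def mask_region(pixels, top, left, height, width, fill=0):
    """Return a copy of a 2D pixel grid with one fixed rectangle overwritten.

    The rectangle is masked whether or not it actually contains notations,
    which is why the input images need a uniform layout.
    """
    masked = copy.deepcopy(pixels)
    for row in range(top, min(top + height, len(masked))):
        for col in range(left, min(left + width, len(masked[row]))):
            masked[row][col] = fill
    return masked


if __name__ == "__main__":
    record = {"PatientName": "DOE^JANE", "PatientAge": "042Y", "Modality": "CT"}
    print(strip_metadata(record))

    # A tiny 4x6 "image" whose top row stands in for a notation strip.
    image = [[9] * 6 for _ in range(4)]
    for row in mask_region(image, top=0, left=0, height=1, width=6):
        print(row)
```

Because mask_region overwrites the same rectangle in every image, a batch that mixes top-positioned and right-positioned notations would leave half of them untouched, which is the uniformity problem the text warns about.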
Source: https://www.dcmsys.com/. Information
shared above is the personal opinion of the
author and not affiliated with the website.