Proprietary file formats in medicine: A short note on their detrimental effect on patient care and research

Research into, and validation of, current and new medical applications requires access to the raw, i.e. originally measured, data of the many systems used for diagnosis and treatment, ranging from clinical biochemical to multimodal imaging applications.

When doing such research, you quickly notice that you cannot get to these raw data, cannot get insight into the pre-processing done by built-in software, and that the file format of the 'public' data is not disclosed.

Often this state of affairs is defended by the manufacturer as protection of IP. In other cases the manufacturers simply refuse access to the data without giving a reason, or grant it only if you sign a paper that effectively makes using that data impossible. Note that the problem is much wider than research. It is just that researchers know what could be done, while physicians are either unaware of that or simply don't tell the patients that their treatment could have been better.

Let me sum up some important problems associated with proprietary files in medicine.

1) Data recorded a few years ago may become effectively unreadable, either because the firm does not exist anymore, because the hospital uses a different brand now, or because the data were recorded in a different hospital. This is unacceptable, particularly in cases of chronic disease. One trend in medicine is that we are increasingly successful in stabilizing diseases, which results in a sharp increase in the number of patients with chronic diseases.

2) Electronic medical records could help in exchanging data between health care personnel in different hospitals and even different countries. What use is an electronic medical record if large parts of the data are inaccessible? A related question: Why are we still using faxes to exchange ECGs between hospitals?

3) If a hospital wants to change to a new supplier of e.g. ECG equipment, the new firm has to be able to read data from the old supplier. If the file formats are not documented, any newcomer in the market has to reverse engineer all files of the major current suppliers. This is costly and may not even be possible, either technically or legally, because of existing IP. This is unfair competition and increases the cost of health care. Moreover, it leads to a virtual monopoly of the current suppliers.

4) An important trend in medicine is patient specific treatment. The reason we can do that now is that data from many sources can be combined. The combination of diverse imaging modalities, or the combination with other measurements such as genetic profiling, is what makes patient specific treatment possible. This almost always means combining data from different manufacturers in ways they did not foresee. The major hurdle for patient specific treatment is therefore understanding what is in all these data files.

5) It is no longer the patient that can decide who can look at his data, but the manufacturer. For the physician it means he can no longer provide optimal care.

6) The points above are all problems that hinder standard and state of the art health care. They in themselves should be enough to forbid any file format in publicly funded health care that does not conform to an open standard. Conforming to a standard, however, may not be enough.

Many systems produce files that conform to the DICOM standard. This standard has many standard fields for most measurements. It does, however, also have the escape of using private fields. That is reasonable for documenting e.g. internal settings and for innovative measurements that are not standardized yet. There are manufacturers, however, that produce a DICOM file with very little standard information, sometimes even just a screen dump of the system at the time. In such files all relevant information is in one or more big, undocumented, private tags.
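To make the point about private fields concrete, here is a minimal sketch in Python of how one might measure how much of a file's content sits in private fields. It relies on the DICOM convention that private data elements use an odd group number. Everything else is an assumption made for illustration: the `Element` tuple, the `private_fraction` helper and the example header are hypothetical, and a real file would first have to be parsed with a DICOM library to obtain the list of elements.

```python
from typing import Iterable, NamedTuple


class Element(NamedTuple):
    """Hypothetical, simplified view of one DICOM data element."""
    group: int    # group number, e.g. 0x0010
    element: int  # element number, e.g. 0x0010
    length: int   # size of the value in bytes


def is_private(group: int) -> bool:
    """Private DICOM data elements use an odd group number."""
    return group % 2 == 1


def private_fraction(elements: Iterable[Element]) -> float:
    """Return the share of value bytes that is stored in private elements."""
    total = 0
    private = 0
    for e in elements:
        total += e.length
        if is_private(e.group):
            private += e.length
    return private / total if total else 0.0


# Made-up header: two standard elements and one large private blob.
example = [
    Element(0x0010, 0x0010, 16),   # Patient's Name (standard)
    Element(0x0008, 0x0060, 2),    # Modality (standard)
    Element(0x0029, 0x1010, 982),  # odd group: private, vendor-specific
]
print(f"{private_fraction(example):.1%} of the value bytes are private")
```

A file that formally conforms to the standard but scores close to 100% on such a check is, for practical purposes, still a proprietary file.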

7) Then there is research.

a) Trying to extract new information from a system, in a way the manufacturer did not think of, is severely hampered. If e.g. someone wants to automatically analyse the QT interval in all ECGs of a subgroup of the patients in a hospital, she might not be able to do that because the raw data are inaccessible. Doing a standardized analysis in a multicenter approach is absolutely out of the question, unless you manually scan the (faxed) paper output. I know studies that did just that.

b) Patient specific treatment again. It is often not yet mainstream, but possibly life saving for individual patients. It is possible only if everything is documented in a readable format whose specs don't change with every new release.

c) Apart from direct patient care, many complex machines are used in molecular diagnostic research, such as sequencers and PCR machines. Researchers have to keep track of what they do and be able to repeat an experiment. That implies that these machines, too, must be able to produce output in an understandable and documented format.

In conclusion: proprietary and undisclosed file formats hamper treatment of patients, interfere in the patient-physician relation, cannot guarantee long term access to data, hamper exchange of data between hospitals, limit the usefulness of electronic medical records, and stifle innovation. They also cost tax payers money and hamper researchers and (European) inventors.


André C. Linnenbank
