NBSdb: New Born Screening Database

Buddhaditta Bose1, Gyan Rajkumar1, M. Narendar Pavan1, A RadhaRama Devi2, H.A. Nagarajaram1*

1 Laboratory of Computational Biology and Bioinformatics, Centre for DNA Fingerprinting and Diagnostics, Hyderabad- 50076, India

2 Laboratory of Diagnostics, Centre for DNA Fingerprinting and Diagnostics, Hyderabad- 50076, India

* Corresponding author Email: han@cdfd.org.in 

Abstract:

The Hyderabad-based Centre for DNA Fingerprinting and Diagnostics (CDFD) has been conducting population-based screening of newborn babies for inborn errors of metabolism. To support proper maintenance of the screening data and post-screening data mining, the data have been computerized. The main components of the computerization are: (a) a custom-designed, web-enabled, secure form through which all data are entered, baby by baby, into the database server, and (b) a relational database called the New Born Screening Database (NBSdb). To date, NBSdb holds records pertaining to 15,000 babies. The database is currently housed on a highly secure computer server and can be accessed only by authorized personnel.

Keywords: Computerized Patient Record, medical informatics, Newborn Screening, Inborn errors, Relational Database

1. Introduction

Inborn errors of metabolism are a major cause of mental retardation. Recent understanding of these diseases at the molecular level has helped in early intervention to prevent disability. Early detection through population screening is the most modern preventive public health program in this country. Newborn screening is mandatory in developed countries and is being undertaken as a pilot program in South Asian countries. Such screening programs generate a large volume of data, which has created a strong need to computerize the data for quick access and for storage of the day-to-day inflow of information. The resulting database can be used to answer the following questions: (a) What are the benefits of screening in the early recognition of disorders? (b) How effective is the screening for case-finding (sensitivity, specificity, and positive predictive value)? (c) Does the program cause any harm? (d) Is the cost of the program balanced in relation to its benefits?
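The case-finding measures named in question (b) can be computed directly from counts of screening outcomes. The following is a minimal sketch of the standard definitions; the function name and the counts are hypothetical and are not NBSdb results.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, positive predictive value)."""
    sensitivity = tp / (tp + fn)  # affected babies correctly flagged
    specificity = tn / (tn + fp)  # unaffected babies correctly cleared
    ppv = tp / (tp + fp)          # flagged babies who are truly affected
    return sensitivity, specificity, ppv


# Hypothetical counts, for illustration only.
sens, spec, ppv = screening_metrics(tp=9, fp=41, fn=1, tn=9949)
print(f"sensitivity={sens:.3f} specificity={spec:.4f} PPV={ppv:.2f}")
```

With a rare disorder, even a highly specific test can have a low positive predictive value, which is why follow-up quantitative testing matters.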

The Hyderabad-based Centre for DNA Fingerprinting and Diagnostics (CDFD), in collaboration with various hospitals in and around Hyderabad, has been screening newborn babies for inborn errors such as congenital hypothyroidism, congenital adrenal hyperplasia, galactosemia, cystic fibrosis, hemoglobinopathies, glucose-6-phosphate dehydrogenase (G6PD) deficiency, biotinidase deficiency and amino acid related disorders. To date, more than 15,000 babies have been screened. To systematically computerize all the diagnostic data generated by the screening program, a pilot project was undertaken to construct a relational database called NBSdb. The data include patient records such as family background, caste, geographical location of the family, family history of any disorders, the diagnostic details, the associated reports, and the details of medical and genetic counseling. The database is unique in two ways:

(a) It stands as the first example of successful application of informatics tools to systematise clinical data in the region.

(b) It forms a knowledge base in relational database format that supports simple to complex queries for extracting useful relationships among features. For example, the database can be explored to investigate the prevalence of various disorders in the local population and to relate that prevalence to the ethnic and socio-economic background of the families.

The database is therefore useful to researchers and students in anthropology, community genetics, bioinformatics, and molecular genetics.
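A prevalence query of the kind described in point (b) might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for the MySQL back end; the table names general_info and test_results, the status code 'A' for abnormal, the placeholder group labels and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE general_info (serial_no TEXT PRIMARY KEY, caste TEXT);
CREATE TABLE test_results (serial_no TEXT, test_type TEXT, test_status TEXT);
""")

# Hypothetical records: four babies in two placeholder groups.
conn.executemany("INSERT INTO general_info VALUES (?, ?)",
                 [("NB0001", "group_1"), ("NB0002", "group_1"),
                  ("NB0003", "group_2"), ("NB0004", "group_2")])
conn.executemany("INSERT INTO test_results VALUES (?, 'TSH', ?)",
                 [("NB0001", "A"), ("NB0002", "N"),
                  ("NB0003", "N"), ("NB0004", "N")])

# Count screened babies and abnormal TSH results per group.
prevalence = conn.execute("""
    SELECT g.caste,
           COUNT(*) AS screened,
           SUM(t.test_status = 'A') AS abnormal
    FROM general_info g
    JOIN test_results t ON t.serial_no = g.serial_no
    WHERE t.test_type = 'TSH'
    GROUP BY g.caste
    ORDER BY g.caste
""").fetchall()

for caste, screened, abnormal in prevalence:
    print(caste, screened, abnormal)
```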

2. Database Description

2.1. Data Organization
The babies screened for various disorders belong to different regions and castes and have different familial, medical, social, economic and ethnic backgrounds. The data for every diagnosed baby are therefore partitioned into the following types: (a) individual personal data, (b) diagnostic data obtained from laboratory tests, and (c) remarks from counselling sessions, suggested prescriptions and improvements observed during follow-up visits.

2.1.1 Individual Personal Data
The personal data consist of vital information on the age, location (city/village), socio-economic condition, and family history of the baby. The family history emphasizes the type of marriage of the baby's parents: whether it is consanguineous and, if so, the degree of consanguinity, viz., first cousins, second cousins, uncle-niece, or beyond.

A unique identification number is assigned to each baby and is retained throughout the entire course of follow-ups and treatment. This number is needed to capture all the clinical data and greatly facilitates individual-specific data organisation. This organisation ensures that the babies' medical records can be used for future reference when assessing clinical information.

2.1.2 Diagnostics data
A baby is identified as having a deficiency, or as being a carrier of a genetic disorder, with the help of diagnostic tests. These tests fall into two categories: qualitative and quantitative. Qualitative tests indicate the likely presence of a disorder or deficiency, whereas quantitative tests measure its extent. Table 2 lists the tests conducted and the category to which each belongs. Screening is mostly qualitative at first; if the result is positive, quantitative tests are conducted. The quantitative tests indicate whether a value is normal, abnormal or transient. Babies with elevated (abnormal) values usually return to the screening center for follow-up and are retested until their test values reach the normal range. All these test results are stored in the database in a systematic, organised manner.

2.1.3 Counselling
Based on the test data, the developmental assessment of the baby is carried out, the necessary follow-up investigations are suggested, family counseling is provided, and the details are entered into the database. The data can be used to assess the response to medical treatment, underlying familial factors, predisposition to certain diseases, etc.

2.2. System Architecture
The database is configured as a true client-server system. All client-server request/response handling is done using Servlets and Java Server Pages. This handling is essential for querying the data entered in the database. The data entered are stored in the back-end relational database (see Figure 1).

Figure 1. Schematic of Architecture

An n-tier database access model is used to let the application communicate with the back-end database. Connectivity between the front-end application and the back-end database is provided through the Java Database Connectivity (JDBC) interface. The back end is a MySQL database, which is easy to use and provides speed, security and recoverability. A state-of-the-art Sun server running Solaris and the JRun Java web server were used to develop and host the application.

2.3. Design of the Front End to the Database
The front end is the software visible to the user who enters data into the computer. It generates web pages dynamically in response to user inputs and queries, and has been developed using HTML forms, one of the most common ways of gathering information from a client. JavaScript is used for automated validation of some of the data entered by the user; the validation consists of checks that ensure the accuracy and quality of the data. The first and foremost feature is user authentication, which is required because the diagnostic data include personal data that must be kept strictly confidential. Access to the application is permitted only to authorized users of the system. The first page, the Login page, prompts users to enter their allotted username and password; only after these are validated can users proceed to the next phase of the application, the data entry page.
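The kind of field checks described above can be sketched as follows. The paper performs them in JavaScript on the client; this Python version is only an illustration of the logic. The 11-character limit on serial_no comes from Table 1(a), while the function name, the sex codes M/F and the numeric rule are assumptions.

```python
import re

# Count and age fields from Table 1(a) that should hold whole numbers.
NUMERIC_FIELDS = ("num_of_births", "num_of_deaths", "num_abortions",
                  "num_living", "age_of_mother")


def validate_entry(form):
    """Return a list of error messages; an empty list means the entry passed."""
    errors = []
    serial_no = form.get("serial_no", "").strip()
    if not serial_no:
        errors.append("serial_no is required")
    elif len(serial_no) > 11:
        errors.append("serial_no must be at most 11 characters")
    # Assumed coding: M or F, or left blank (the column allows NULL).
    if form.get("sex", "") not in ("M", "F", ""):
        errors.append("sex must be M or F")
    for field in NUMERIC_FIELDS:
        value = form.get(field, "")
        if value and not re.fullmatch(r"\d{1,3}", value):
            errors.append(f"{field} must be a non-negative whole number")
    return errors


print(validate_entry({"serial_no": "NB0001", "sex": "F", "age_of_mother": "27"}))
print(validate_entry({"serial_no": "", "sex": "X", "num_living": "two"}))
```

Repeating such checks on the server side is a common design choice, since client-side scripts can be bypassed.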

The data entry page (Figure 2) contains the fields for entering the necessary information. The data to be entered on this page are described in Section 2.1.

Figure 2. Snapshot of the data entry page (HTML Form) listing the fields to be entered.

The Confirmation/Modification page appears next. It lets the user review the data entered on the data entry page and correct any typographical errors before confirming that the data are correct.

Additionally, a data update page is provided for updating patient records after multiple visits or follow-ups, and a data query page is provided so that users can query the patient data stored in the database.

2.4. Database Design
The database has been designed so that the stored data are free from internal contradictions and are subject to specific constraints, so that the database returns consistent results when multiple users query the data simultaneously. The tables have been designed so that data are entered in a non-redundant manner, that is, without duplication. The integrity and consistency of the data are protected by normalization. Normalization splits related data across multiple tables, which avoids redundancy; queries then use operations called joins to reassemble the data.

Separate tables have been developed to store the personal information, the laboratory test results, the diagnosis results, and the age of the baby at each visit. These are detailed in the database schema below. The tables have been created using Structured Query Language (SQL).
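The effect of normalization can be illustrated with a small sketch: personal details are stored once per baby, each visit adds only a result row, and a join reassembles the full history. SQLite (through Python's sqlite3 module) stands in for MySQL here, and the table names and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE general_info (serial_no TEXT PRIMARY KEY, name TEXT, hospital TEXT);
CREATE TABLE test_results (testdate TEXT, serial_no TEXT,
                           test_type TEXT, test_value REAL);
""")

# Personal details are stored once per baby ...
conn.execute("INSERT INTO general_info VALUES ('NB0001', 'Baby A', 'Hospital X')")
# ... while each follow-up visit adds only a test result row.
conn.executemany("INSERT INTO test_results VALUES (?, 'NB0001', 'TSH', ?)",
                 [("2004-01-05", 24.0), ("2004-02-05", 12.5), ("2004-03-05", 6.1)])

# The join reassembles personal details and results for reporting.
history = conn.execute("""
    SELECT g.name, g.hospital, t.testdate, t.test_value
    FROM general_info g
    JOIN test_results t ON t.serial_no = g.serial_no
    ORDER BY t.testdate
""").fetchall()

for row in history:
    print(row)
```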

2.5. Database Schema
The schema of the tables which form the relational database at the back end is given in Table 1 (a)-(d).

Table 1(a): Stores the general values of the newborn babies. The fields in the table are explained serially

Field #        Type          NULL  Key  Default  Extra
serial_no      varchar(11)         PRI
name           varchar(255)  YES        NULL
sex            char(1)       YES        NULL
hospital       varchar(255)  YES        NULL
address        varchar(255)  YES        NULL
phone_no       varchar(30)   YES        NULL
caste          varchar(255)  YES        NULL
religion       char(1)       YES        NULL
type_of_mrg    char(1)       YES        NULL
num_of_births  smallint(6)   YES        NULL
num_of_deaths  smallint(6)   YES        NULL
num_abortions  smallint(6)   YES        NULL
num_living     smallint(6)   YES        NULL
age_of_mother  smallint(6)   YES        NULL

# First field serial_no stores the unique patient number.

Second field name stores the name of the patient.

Third field sex stores the sex of the patient.

Fourth field hospital stores the name of hospital from which the baby has been referred to CDFD for screening.

Fifth and sixth fields address, and phone_no store the contact information of the baby.

The seventh field caste stores information regarding the caste of the newborn.

The eighth field religion stores information regarding the religion of the newborn.

The type_of_mrg field stores the type of marriage of the newborn's parents. This field is used to check for consanguinity.

The num_of_births field stores the total number of babies born in the newborn's family.

The num_of_deaths field stores the number of newborns who died in the family of the patient (newborn).

The num_abortions field stores the number of abortions that have taken place in the newborn's family.

The num_living field stores the number of living babies in the newborn's family.

The age_of_mother field stores the age of the mother.

Table 1(b): This table stores the information regarding the test results obtained out of the screening performed for various disorders on the newborn.

Field #      Type         NULL  Key  Default     Extra
Testdate     date                    0000-00-00
serial_no    varchar(11)
test_type    char(30)
test_value   float        YES        NULL
test_status  char(1)      YES        NULL

# The testdate field stores the date on which the newborn visits the center for screening.

The serial_no field stores the unique patient number. This is a foreign key referencing the primary key (serial_no) of Table 1(a).

The test_type field stores the name of the disorder for which the newborn is screened, e.g. TSH, Biotin deficiency, etc.

The test_value field stores the quantitative result of the test for which the newborn is screened.

The test_status field stores the status of the quantitative value, i.e. whether it is Normal, Abnormal or Borderline.
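The foreign key relationship between Table 1(b) and Table 1(a) can be sketched as below. SQLite stands in for MySQL, the table names general_info and test_results are hypothetical (the paper does not name its tables), only two columns of Table 1(a) are reproduced, and whether the production database enforced the constraint at the engine level is not stated in the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE general_info (
    serial_no VARCHAR(11) PRIMARY KEY,
    name      VARCHAR(255)
);
CREATE TABLE test_results (
    testdate    DATE NOT NULL,
    serial_no   VARCHAR(11) NOT NULL REFERENCES general_info(serial_no),
    test_type   CHAR(30) NOT NULL,
    test_value  FLOAT,
    test_status CHAR(1)
);
""")

# Hypothetical rows: one registered baby and one test result for that baby.
conn.execute("INSERT INTO general_info VALUES ('NB0001', 'Baby A')")
conn.execute("INSERT INTO test_results VALUES ('2004-01-05', 'NB0001', 'TSH', 24.0, 'A')")

# A result for an unregistered serial_no violates the foreign key.
try:
    conn.execute("INSERT INTO test_results VALUES ('2004-01-05', 'NB9999', 'TSH', 5.0, 'N')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("orphan result rejected:", orphan_rejected)
```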

Table 1(c): This table stores the information regarding the age of the newborns at the time they come for screening and the subsequent followup visits. It also stores the information regarding the counseling provided to them.

Field #        Type         NULL  Key  Default     Extra
testdate       date                    0000-00-00
serial_no      varchar(11)
age_yrs        smallint(6)  YES        NULL
age_mths       smallint(6)  YES        NULL
age_days       smallint(6)  YES        NULL
med_presc      smallint(6)  YES        NULL
dev_assess     char(255)    YES        NULL
fol_up_invest  char(255)    YES        NULL
fam_counsel    char(255)    YES        NULL

# The testdate field stores the date on which the newborn visits the center for screening.

The serial_no field stores the unique patient number. This is a foreign key referencing the primary key (serial_no) of Table 1(a).

The age_yrs field stores the age of the newborn in terms of number of years.

The age_mths field stores the age of the newborn in terms of number of months.

The age_days field stores the age of the newborn in terms of number of days.

The med_presc field stores the information regarding the medicines prescribed to the newborn after the results of screening are obtained.

The dev_assess field stores the information regarding the developmental assessment of the newborn after the results of screening are obtained.

The fol_up_invest field stores the information regarding the follow up investigations to be performed on the newborn after the results of screening are obtained.

The fam_counsel field stores information regarding the counseling provided to the family of the newborns after the results of screening are obtained.

Table 1(d): This table stores the list of authenticated users of the system and their passwords.

Field #   Type         NULL  Key  Default  Extra
username  varchar(30)
password  varchar(20)

# The username field stores the name of the user.

The password field stores the password of the user.
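A login check against Table 1(d) might be sketched as follows. The table name users, the function name and the account are hypothetical, and SQLite stands in for MySQL. The query is parameterized so that user input is never spliced into the SQL text.

```python
import hmac
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username VARCHAR(30), password VARCHAR(20))")
conn.execute("INSERT INTO users VALUES ('labuser', 'example-pass')")  # hypothetical account


def is_authorized(conn, username, password):
    """Return True only if the username exists and the password matches."""
    row = conn.execute(
        "SELECT password FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    # Constant-time comparison of the stored and supplied passwords.
    return hmac.compare_digest(row[0].encode(), password.encode())


print(is_authorized(conn, "labuser", "example-pass"))
print(is_authorized(conn, "labuser", "wrong"))
```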

Tables 2(a) and 2(b) show the nature and range of the different tests conducted.

Table 2(a) Qualitative Tests

Test                             Values
Plasma Aminoacid Analysis (TLC)  Normal (reference to Rf values of amino acids)
G6PD (Spot Test)                 Normal/Abnormal
Haemoglobin Sickle               Normal (Homozygous)/Abnormal (Heterozygous)

 
Table 2(b) Quantitative Tests

Test                                    Normal          Borderline      Abnormal
TSH                                     0-10 μIU/ml     10-20 μIU/ml    >20 μIU/ml
Galactose 1-6 uridyl Transferase        >1.2 U/gHb      1-1.2 U/gHb     <1 U/gHb
Total Galactose                         <4.5 mg/dl      4.5-6.5 mg/dl   >6.5 mg/dl
17OHP                                   <65 ng/ml       65-70 ng/ml     >70 ng/ml
Immuno Reactive Trypsinogen (IRT)       <84 μg/L        84-100 μg/L     >100 μg/L
Biotinidase Deficiency                  >0.14 OD        0.12-0.14 OD    <0.12 OD
Glucose 6 Phosphate Deficiency (G6PD)   >3 U/gHb        2.5-3 U/gHb     <2.5 U/gHb
  (Enzyme Quantitation)

Plasma Aminoacid Analysis (Abnormal Case)
Aspartic Acid                           20-129 μmol/l                   >129 μmol/l
Glutamic Acid                           62-620 μmol/l                   >620 μmol/l
Hydroxy Proline                         0-91 μmol/l                     >91 μmol/l
Serine                                  99-395 μmol/l                   >395 μmol/l
Asparagine                              29-132 μmol/l                   >132 μmol/l
Glycine                                 230-740 μmol/l                  >740 μmol/l
Histidine                               30-138 μmol/l                   >138 μmol/l
Citrulline                              10-45 μmol/l                    >45 μmol/l
Threonine                               90-329 μmol/l                   >329 μmol/l
Alanine                                 131-710 μmol/l                  >710 μmol/l
Arginine                                35-214 μmol/l                   >214 μmol/l
Proline                                 110-417 μmol/l                  >417 μmol/l
Tyrosine                                55-147 μmol/l                   >147 μmol/l
Valine                                  86-190 μmol/l                   >190 μmol/l
Methionine                              10-60 μmol/l                    >60 μmol/l
Cystine                                 17-98 μmol/l                    >98 μmol/l
Isoleucine                              26-91 μmol/l                    >91 μmol/l
Leucine                                 48-160 μmol/l                   >160 μmol/l
Phenylalanine                           38-137 μmol/l                   >137 μmol/l
Tryptophan                              0-60 μmol/l                     >60 μmol/l
Ornithine                               48-211 μmol/l                   >211 μmol/l
Lysine                                  92-325 μmol/l                   >325 μmol/l
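The ranges in Table 2(b) suggest how a quantitative result could be mapped to the Normal/Borderline/Abnormal status stored in the test_status field. The sketch below uses the cut-offs from the table; the short test names, the function name and the choice to treat values exactly on a cut-off as Borderline are assumptions, since the table leaves the boundaries ambiguous.

```python
# (borderline_low, borderline_high, high_is_abnormal), cut-offs from Table 2(b).
RANGES = {
    "TSH": (10.0, 20.0, True),            # μIU/ml
    "Total Galactose": (4.5, 6.5, True),  # mg/dl
    "17OHP": (65.0, 70.0, True),          # ng/ml
    "IRT": (84.0, 100.0, True),           # μg/L
    "Biotinidase": (0.12, 0.14, False),   # OD
    "G6PD": (2.5, 3.0, False),            # U/gHb
}


def classify(test_type, value):
    """Map a quantitative result to Normal, Borderline or Abnormal."""
    low, high, high_is_abnormal = RANGES[test_type]
    if low <= value <= high:
        return "Borderline"  # assumption: cut-off values count as borderline
    if value > high:
        return "Abnormal" if high_is_abnormal else "Normal"
    return "Normal" if high_is_abnormal else "Abnormal"


print(classify("TSH", 24.0))
print(classify("G6PD", 3.4))
```

For enzyme assays such as G6PD and biotinidase, low values are the abnormal ones, which is why each entry carries a direction flag.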

3. Benefit of the Task

This endeavor consolidates clinical information into a single, research-oriented view of the data, accessible through a web browser, and gives immediate access to vital information about each baby. This information can be disclosed to the concerned physician and medical staff for the purpose of treatment, and can also be given to the baby's family. The database will supply data that become clinically relevant knowledge and will place comprehensive, integrated and easily accessible information at the user's fingertips. The data can be queried and the results subjected to statistical tests. The data querying facility has not yet been developed, but a versatile web-enabled query page based on Java server-side technologies is planned.

4. Conclusion

Screening newborns for inborn errors of metabolism, or indeed any kind of screening for genetic disorders, produces a large amount of clinically significant data. These data are very valuable to the medical and scientific community and therefore need to be stored in a way that ensures their integrity, security and safety, which can be achieved using relational database (RDBMS) technology. The newborn screening database that we have developed is an important step in this direction. We are also developing relational databases for the other screening programs at our institute and plan to provide links between our databases and other genomic resources available on the World Wide Web. We hope that our effort will be of great help to clinicians, researchers, anthropologists and bioinformaticians.


© Indian Association for Medical Informatics
This page last updated: July 2008