2. Principle of Discrimination of Two Groups

Using the PDF density rather than the PDF itself is an essential shortcoming of the approach set forth in (2). The advantage of the integral PDF follows from the Glivenko-Cantelli theorem, according to which the empirical PDF converges to the actual PDF with probability 1. There is no similar theorem for the density, so estimating the empirical PDF density is a more complicated task: the precision of its simplest non-parametric estimates (histogram, frequency polygon) depends on how the empirical parameter range is divided into intervals. When the distribution deviates from normality, empirical estimates (descriptive statistics) depend strongly on the form of the PDF density. For example, the estimate of the difference between the mean values of the two groups becomes unstable, so it is necessary to describe the difference between medians instead. In this connection, the authors believe that using the integral PDF is a means of regularizing noise (which the authors attribute to external non-informative factors; see Introduction).
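The contrast between the integral PDF and its density can be illustrated with a minimal sketch. The empirical integral PDF below needs no choice of intervals, unlike a histogram. The function name `ecdf` and the sample values are illustrative assumptions, not taken from the study.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical integral PDF: x -> share of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many ordered values are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Made-up values, not data from the study.
values = [3.1, 4.7, 2.2, 5.0, 4.7]
F = ecdf(values)
print(F(2.0), F(4.7), F(5.0))
```

The estimate is fully determined by the sample itself, which is the practical side of the convergence property mentioned above.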
336 Intelligent Systems

Fig. 1. Recognition between two groups according to probability (left) and statistical (right) decision theories.
Fig. 1 shows schematically the difference between probability theory and statistical decision theory based on a binary DR. In general, dividing two groups may result in 4 classes. Expression (3) describes them in terms of probability theory, and (4)-(5) in terms of a binary DR.
\[
P(1) = P(A \setminus B) = P(A) - P(AB), \quad P(2) = P(B \setminus A) = P(B) - P(BA), \quad P(3) = P(BA), \quad P(4) = P(\overline{A \cup B}) = 1 - P(A \cup B). \tag{3}
\]
\[
P(1) = P(TN) = a/N, \quad P(2) = P(TP) = b/N, \quad P(3) = P(\mathrm{False}) = P_{FN} + P_{FP}, \quad P_{FN} = c/N, \quad P_{FP} = d/N. \tag{4}
\]

\[
P(4) =
\begin{cases}
0 & \text{within the Discrimination problem, because } P(1) + P(2) + P(3) = 1, \\
P(\mathrm{Other}) \text{ or } P(\mathrm{Undefined}) & \text{within the Classification problem.}
\end{cases} \tag{5}
\]

Comparing (4) and (3), we see that a DMA based on a binary DR loses the symmetry present in probability theory, i.e. P(AB) = P(BA): the classification probabilities FN and FP differ in general, P_FN ≠ P_FP, because c ≠ d. It is thus necessary to develop a DMA that finds the threshold X_OPT which equalizes the probabilities of errors of both kinds. This corresponds to replacing the line X_CR with some optimal line, in general a curve X_OPT, which splits the space of both events so that their resulting spaces are equal. The procedure for obtaining X_OPT using the integral PDF instead of the differential PDF is shown below.
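The idea of a threshold that equalizes both error probabilities can be sketched with empirical integral PDFs, each normalized by its own group size. This is only an illustration under stated assumptions: no smoothing is applied, larger parameter values are assumed to indicate CAD, and the function names and sample values are hypothetical rather than the authors' implementation.

```python
from bisect import bisect_right


def ecdf(sample):
    """Empirical integral PDF: x -> share of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def equal_error_threshold(healthy, cad):
    """Return (x, fp, fn) for the observed value x that minimises |fp - fn|.

    Assumed decision rule: a value above x is classified as CAD.
    """
    f_healthy, f_cad = ecdf(healthy), ecdf(cad)
    best = None
    for x in sorted(set(healthy) | set(cad)):
        fp = 1.0 - f_healthy(x)  # share of healthy subjects above the threshold
        fn = f_cad(x)            # share of CAD patients at or below the threshold
        gap = abs(fp - fn)
        if best is None or gap < best[0]:
            best = (gap, x, fp, fn)
    return best[1], best[2], best[3]


# Made-up samples of unequal size, not data from the study.
healthy = [1, 2, 3, 4, 5, 6]
cad = [4, 5, 6, 7, 8, 9, 10, 11]
x_opt, fp, fn = equal_error_threshold(healthy, cad)
print(x_opt, round(fp, 3), round(fn, 3))
```

Because each error rate is a fraction of its own group, the result does not depend on the groups having equal size; with small unsmoothed samples the two rates can only be made approximately equal.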
3. Approach Based on Integral PDF

The task to be solved is therefore to unify data processing so that the DM quality can be compared between two groups that are non-homogeneous with respect to uncontrolled disturbing factors. Three methods of discriminating two groups were compared:
1) traditional method based on differential PDF;
2) method based on non-normalized and non-smoothed integral PDF;
3) method based on standardized (normalized and smoothed integral) PDF.
The experimental data cover 151 examined patients forming two groups: 70 healthy persons and 81 CAD patients with unchanged ECG at rest. It is well known that CAD is the most widespread heart disease, and reliable CAD diagnostics remains an open problem. The reasons are that the rest ECG is normal, EchoCG and MRI are mainly limited to the morphological structure of the heart (and are not linked with its electrophysiology), and various stress tests carry risks.
On the other hand, magnetocardiography (MCG) is a method of non-invasive recording and analysis of the magnetic field of the heart, which arises from its electrical activity. The advantages of MCG mapping are: 1) localization of the electrical source in the myocardium, 2) high sensitivity to various pathological disturbances, 3) detection of disease at early stages and in silent forms, 4) safety. Over the last decade, considerable efforts have been directed toward studying MCG for CAD diagnostics.
XII-th International Conference "Knowledge - Dialogue - Solution"

MCG observations were performed in the Biomagnetic Lab (Institute for Cardiology, Kyiv) with a 4-channel magnetocardiograph in an unshielded environment according to the standard method. Medical analysis was carried out with the help of 9 numerical MCG indexes reflecting the spatial-temporal structure of the magnetic field; for details see [1,3]. The control group consisted of volunteers with no indications of cardiac disease found by routine clinical methods. MCG was recorded with a 7-channel magnetocardiograph CARDIOMAG (Glushkov Institute for Cybernetics, Kyiv, Ukraine). Additionally, routine clinical examinations (ECG, EchoCG, bicycle test) were conducted. The results of discrimination are presented in Table 1 and Fig. 2.
Table 1. Results of DPS of two groups based on integral PDF and binary DR

        Non-normalized and non-smoothed PDF                          Standardized PDF
Index   M_Hel   M_CAD   X_CR    SP    SN    NPV   PPV   V_AVE       X_CR    α       V
Nst     3.68    12.75   7       69%   64%   62%   71%   66.8%       6.37    0.311   68.9%
MR      7.75    11.09   10      67%   67%   64%   69%   66.6%       9.61    0.314   68.6%
IFV     1.9     2.86    2.55    67%   66%   63%   69%   66.2%       2.34    0.343   65.7%
IFH     4.85    6.91    6.2     64%   61%   59%   66%   62.5%       5.84    0.351   64.9%
Y       10.84   12.74   12.1    58%   69%   62%   65%   63.5%       11.86   0.392   60.8%
Nj      3.5     4.16    4       51%   66%   56%   61%   58.6%       3.84    0.414   58.6%

The diagnostic value of the indexes was evaluated with the help of the "average value" V_AVE and the value V, both defined in (6), for non-normalized and normalized PDFs, respectively. Here α is the probability of classification error (within the framework of this method, errors of both kinds are equal).

\[
V_{AVE} = (SP + SN + NPV + PPV)/4, \qquad V = 1 - \alpha. \tag{6}
\]

Fig. 2. Discrimination of two groups based on smoothed PDF for the quasi-Gaussian (left) and non-Gaussian (right) case.
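The two figures of merit in (6) can be computed as in the following sketch. It assumes that the counts a, b, c, d of (4) stand for true negatives, true positives, false negatives and false positives, respectively; the function name and the counts are hypothetical and chosen only so that the group sizes are 70 and 81.

```python
def figures_of_merit(a, b, c, d):
    """a = TN, b = TP, c = FN, d = FP (assumed reading of the counts in (4))."""
    sp = a / (a + d)   # specificity
    sn = b / (b + c)   # sensitivity
    npv = a / (a + c)  # negative predictive value
    ppv = b / (b + d)  # positive predictive value
    v_ave = (sp + sn + npv + ppv) / 4
    return sp, sn, npv, ppv, v_ave


# Illustrative counts only: 70 healthy (48 + 22) and 81 CAD (52 + 29).
sp, sn, npv, ppv, v_ave = figures_of_merit(a=48, b=52, c=29, d=22)
print(round(sp, 3), round(sn, 3), round(npv, 3), round(ppv, 3), round(v_ave, 3))

# For the standardized PDF a single error probability alpha gives V directly,
# e.g. alpha = 0.311 from the Nst row of Table 1.
v = 1 - 0.311
print(round(v, 3))
```

V_AVE needs four separately estimated quantities, whereas V follows from one number, which is why the two are compared as alternative FOMs.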
Table 1 shows that the two FOMs V_AVE and V are very close, so that the ordering of the best parameters according to either of them is practically identical. This means that the method based on the standardized PDF is insensitive to disturbing factors, and the value V should be considered an adequate and stable FOM for the quality (power, reliability) of the discrimination problem solution (DPS). In other words, the DPS method utilizing the normalized PDF may be used in medical statistics if real data are blurred, non-Gaussian, and asymmetrical.
That this approach is optimal is confirmed by ROC analysis (Fig. 3). There is an optimal value of prevalence at which Sp = Sn (point C, Fig. 3). Yet the best division of two groups is achieved for equal groups, i.e. n = 1 (point D, Fig. 3), although in that case the error probabilities differ, Sp ≠ Sn. To reach the maximum with Sp = Sn, errors of the 1st and 2nd kind should be equal. Consequently, it is necessary to look for a DPS technique in which both errors are equal. Fig. 4 shows that applying the normalized PDF automatically ensures the equalities α = β and Sp = Sn (i.e. points C and D converge). The advantage is that this technique does not need equal groups in practice, since the equalization is done purely mathematically, by normalizing the empirical PDF.
Fig. 3. Parametric dependence of Sn on Sp, the so-called ROC curve, based on the non-normalized PDF.

Fig. 4. Dependence of Sp and Sn on the threshold of a binary DR for normally distributed groups.
4. Generalized Parameters

To implement a DMA in practice, it is necessary to develop a DPS method that allows selecting reliable discriminant parameters with a sufficiently high FOM (80% and above), i.e. V according to Sec. 1.3. Table 1 shows, however, that no single parameter reaches such a value, since V < 70%. The low reliability of CAD recognition necessitates the use of generalized (integral) indices instead of particular ones. For this purpose, we employed linear discriminant analysis (LDA) and studied the dependence of the discrimination power on various combinations of MCG indices in the LDA space. In this way, so-called cumulative parameters were introduced. The i-th cumulative parameter Z_i is the sum of the first i particular parameters, arranged in descending order of the significance level of the two-population t-test.
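The construction of cumulative parameters can be sketched as follows. Several points are assumptions made only for illustration: the ranking uses the absolute value of Welch's t statistic as a stand-in for the significance level, the most significant parameter is placed first, raw values are summed without rescaling, and the parameter names "A" and "B" and their values are made up.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Two-sample t statistic with unequal variances (Welch)."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


def cumulative_parameters(healthy, cad):
    """healthy, cad: dict mapping parameter name -> list of per-subject values.

    Returns the ranked names and, for each group, the list [Z_1, Z_2, ...],
    where Z_i holds one value per subject.
    """
    names = sorted(healthy,
                   key=lambda k: abs(welch_t(healthy[k], cad[k])),
                   reverse=True)

    def cumulate(group):
        n_subjects = len(next(iter(group.values())))
        running = [0.0] * n_subjects
        result = []
        for name in names:
            # add the next-ranked parameter to the running sum
            running = [r + v for r, v in zip(running, group[name])]
            result.append(running)
        return result

    return names, cumulate(healthy), cumulate(cad)


# Made-up values for two hypothetical parameters and three subjects per group.
healthy = {"A": [1, 2, 3], "B": [1, 2, 3]}
cad = {"A": [10, 11, 12], "B": [2, 3, 4]}
names, z_healthy, z_cad = cumulative_parameters(healthy, cad)
print(names, z_healthy, z_cad)
```

Each Z_i can then be passed to whatever discrimination procedure is used, so that the power can be tracked as a function of i.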
Fig. 5. Dependence of LDA power on particular (a) and cumulative (b) parameters for healthy subjects and CAD patients; horizontal axis of the cumulative plot: Z1 to Z10.
First, 10 MCG parameters characterizing IHD were calculated: MR, IFH, IFV, N_ST, L_ST, AFV, AFH, Y, N_T, N_J. The best 6 of them, X1-X6 (N_ST, MR, IFH, IFV, L_ST, Y), for which p < 0.01, are shown in Fig. 5a. The figure shows that the dependence of discrimination power is irregular, since the strength and nature of the impact of external uncontrolled factors on the statistical distribution differ for each parameter. For cumulative parameters, on the other hand, discrimination power increases gradually for both healthy individuals and IHD patients (Fig. 5b). Consequently, 6 variables suffice for the best division of healthy individuals and IHD patients based on MCG indexes, since the curve maximum for healthy individuals (81.7%) and a local maximum for IHD patients (75.3%) are reached with the same 6 parameters.
Yet ordering the 6 indexes cumulatively according to the t-test is not optimal if the PDFs differ strongly from the Gaussian form. Therefore all possible combinations C_6^2, C_6^3, C_6^4, C_6^5 were studied. The results are set forth in Table 2.

Table 2. Optimized order in sets of parameters giving the best division of groups.

Number of parameters in set   1       2            3              4                   5
Healthy                       N_ST    N_ST-L_ST    MR-Y-N_ST      MR-L_ST-N_ST-Y      N_ST-Y-L_ST-IFH-MR
Power, %                      76.05   80.28        80.28          80.28               81.7
CAD                           IFH     MR-IFV       MR-IFV-N_ST    MR-N_ST-L_ST-IFV    N_ST-Y-L_ST-IFH-MR
Power, %                      71.6    71.6         70.37          75.3                75.3

Table 2 shows that the best power with a set of 5 parameters increases from 77.46% to 81.7% for healthy individuals and from 74.07% to 75.3% for IHD patients. As a result of the combinatorial search, a set of 5 parameters attains the LDA power that previously required 6 parameters ordered in a non-optimal way. A cumulative parameter formed by 5 parameters was found that has maximal discrimination power both for CAD patients against healthy subjects (75%) and for healthy subjects against CAD patients (82%).
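The size of such a combinatorial search can be checked with a short sketch. The subset sizes 2 to 5 are an assumption, and the scoring function is a hypothetical placeholder with made-up weights; in a real run it would return the LDA power of the subset.

```python
from itertools import combinations
from math import comb

# The six best parameters named in the text.
PARAMS = ["Nst", "MR", "IFH", "IFV", "Lst", "Y"]


def best_subset(params, size, score):
    """Exhaustively pick the subset of the given size with the highest score."""
    return max(combinations(params, size), key=score)


# Number of candidate subsets for each size from 2 to 5.
counts = {k: comb(len(PARAMS), k) for k in range(2, 6)}
print(counts, sum(counts.values()))

# Hypothetical placeholder score with made-up weights, for illustration only.
toy_weight = {"Nst": 6, "MR": 5, "IFH": 4, "IFV": 3, "Lst": 2, "Y": 1}
best_pair = best_subset(PARAMS, 2, lambda subset: sum(toy_weight[p] for p in subset))
print(best_pair)
```

With only 56 subsets in total, exhaustive enumeration is cheap, so no heuristic search is needed for six parameters.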
Thus the non-homogeneity of the groups was effectively removed from the LDA viewpoint. The identity of the set of 5 parameters shows the homogeneity of both groups in terms of the discrimination task. Here the loss of discrimination quality caused by external uncontrolled factors is lowest for both groups simultaneously. Of course, the maximal LDA powers are not equal, since these factors affect each group with different strength.
5. Calculation of Risk (Intermediate) Zone Based on Fuzzy Logic

When a threshold DR is used, the DPS error increases for persons whose parameter value is close to the threshold, i.e. on the border where the training statistical samples overlap. This especially affects the quality of DPS for research methods sensitive to the impact of external non-informative factors. It is therefore necessary to identify an intermediate zone of the parameter corresponding to the risk of a certain disease. An example of an IHD risk zone identified by means of the MCG parameters whose PDFs are shown in Fig. 2 is given in Fig. 6. For both quasi-Gaussian and essentially non-Gaussian distributions the risk zone is determined unambiguously, because the PDF graphs are monotonic.
Fig. 6. Determination of the intermediate zone, in which a risk of CAD is predicted, for parameters MR (a) and Nst (b).
This approach is based on the transition from the PDF to the membership functions (MF) F used in fuzzy logic, which are defined as follows:
\[
F(1) = P(TN) - P_{FN}, \quad F(2) = P(TP) - P_{FP}, \quad F(\mathrm{Intermediate}) = F(\mathrm{Risk}) + F(\mathrm{Undefined}), \tag{7}
\]

\[
F(\mathrm{Risk}) = F(3) = 2 P(\mathrm{False}) = 2 (P_{FN} + P_{FP}), \quad F(\mathrm{Undefined}) = F(4) = 1 - [F(1) + F(2) + F(3)]. \tag{8}
\]

Conclusion

As a result, a method for finding numerical informative parameters, i.e. an identification procedure describing a complex living object, has been developed. As an example, parameters containing diagnostic information about failures of electrical processes in the human heart have been considered. Such parameters have been studied in patients with coronary artery disease (CAD) by means of magnetocardiographic (MCG) mapping. The high sensitivity of MCG leads to a substantial influence of uncontrolled external factors, which increases the blurring of the observed groups. That is why selecting and ordering reliable MCG indexes is a very complex task. To improve the quality of data processing, an approach based on the integral probability distribution function (PDF) has been proposed. As a result, a single figure of merit (FOM), the so-called "Significance", has been found to replace 4 separate ones (sensitivity Sn, specificity Sp, negative (NPV) and positive (PPV) predictive values). A selection procedure for generalized parameters involving LDA was proposed, and an intermediate zone, i.e. the range of parameter values where there is a risk of disease, was determined.
Acknowledgements

The results were obtained in part owing to the financial support of the Science and Technology Center in Ukraine (STCU) under project 2187. The authors would also like to express their sincere gratitude to Prof. V.P. Gladun (Kyiv, Ukraine) for help and discussion.
Bibliography

[1] Diagnostic criteria for chronic coronary artery disease based on registry and analyses of the magnetocardiograms / Budnyk M, Chaikovsky I, Kozlovsky V et al // Preprint 2002-5, Institute for Cybernetics, Kyiv (Ukraine), 2002.
[2] Low-cost 7-Channel Magnetocardiographic System for Unshielded environment / Budnyk M, Voytovych I, Minov Yu et al // Neurology and Clinical Neurophysiology, 2004:112:1-7. http://www.ncnpjournal.com
[3] Evaluation of Magnetocardiography Indices in Patients with Cardiac Diseases / Budnyk M, Kozlovsky V, Stadnyuk L et al // Neurology and Clinical Neurophysiology, 2004:111:1-6. http://www.ncnpjournal.com
[4] Supersensitive MCG system for early identification and monitoring of heart diseases (medical application) / Voytovych I, Kozlovsky V, Budnyk M et al // Control Systems and Machines, 2005, No 3, p. 50-62.
[5] Budnyk M., Zakorcheny O., Ryzhenko T., Zholob V. Evaluation of valuable numerical indexes derived from MCG data. In: Proc. Intern. Conference ICAP'2005, Kyiv (Ukraine), 2005, p. 170-171.
[6] Chernysheva D, Budnyk M. Using of discriminant analysis for processing of magnetocardiographic information. In: Computer tools, system and nets. Ed. O.V. Palagin. Glushkov Institute for Cybernetics, Kyiv (Ukraine), 2004.