Bioactivity-Structure. Interrelation of Electronic and Information Factors of Biological Activity of Chemical Compounds

The biological activity of chemical compounds is analyzed using electronic and information factors. We found a linear interrelation between the electronic and information factors of molecules, even though these molecular factors are calculated from different principles: the electronic factor is determined by a quantum-mechanical method from the molecular pseudopotential, whereas the information factor is determined by using the information function. It is shown that these factors separate bioactive chemical compounds from inactive ones with statistical significance. To determine these factors it is sufficient to know only the chemical formula of the molecule. We analyzed chemical compounds for toxicity, antiradiation activity, carcinogenicity and antifungal activity. To identify biologically active chemical compounds we used the statistical method of conjugation of qualitative attributes.


Introduction
Knowledge of the quantitative stochastic interrelation between the chemical structure of a molecule and its physiological activity has important theoretical and practical significance. Such knowledge is essential for elucidating the mechanism of the biochemical action of molecules, as well as for improving existing products and searching for new drugs. However, the task is complicated by the fact that reliable, complete and homogeneous experimental data are not available for different classes of chemical compounds. Here we take into account the comments of Alexander and Bacq [1] on the importance of the primary chemical structure of a drug for the manifestation of its biological effect. The lack of reliable and complete experimental physicochemical and biochemical information about chemical compounds necessitates the development of a methodology for assessing biological activity based only on knowledge of the chemical formula of the molecule. We offer here a mathematical approach to analyzing the relationship between the structure of a chemical compound and the physiological response. This approach does not require cumbersome calculations.
In this paper, we propose to evaluate the biological activity of chemical compounds by using electronic and information factors. To determine these factors it is sufficient to know only the chemical formula of the drug. We suppose that the molecule has some effective electrostatic potential [2,3]. There is good reason to believe that this potential may influence the processes regulating the vital functions of the biological object, and thereby determine the biological activity of chemical compounds. Within the pseudopotential approach it has been shown that the potential of interaction of the molecule with an external electron may be represented by equation (1), in which the functions f(r) and F(r) are corrections to the Coulomb potential that depend on the distance r between the molecule and the electron. It is well known that the chemical properties of molecules are determined by the outer-shell electrons. This approximation is called the "frozen-core approximation." In this approximation, the electrostatic potential is determined by the first term of equation (1). This pseudopotential was applied successfully [4] to calculate the molecular potentials of purine and pyrimidine molecules.

Methods and Discussion
Table 1 shows two groups of sulfur-containing chemical compounds, bioactive and inactive. The first group includes chemicals having radioprotective activity (efficiency higher than 50%), and the second group comprises chemicals that have no radioprotective activity even when used at very high doses.
Analysis showed that the parameter (descriptor) Z separates, with statistical reliability, the preparations having a radioprotective effect from the chemical compounds that have no protective effect. Indeed, for effective radioprotectors (dose < 1 mmol/kg), the parameter Z ≤ Z(av) = 3.0. Here Z(av) is the average value over the random sample of Table 1. We accept the parameter Z(av) = Z* as the threshold character. At the same time, for chemical compounds that have no protective effect, Z > Z(av). To confirm the validity of the statistical separation of chemical compounds into groups we use the statistical method of comparison of dichotomous attributes. Using the data in Table 2, we define the Pearson coefficient of association Φ [7]:

$$\Phi = \frac{q_{11} q_{22} - q_{12} q_{21}}{\sqrt{(q_{11} + q_{12})(q_{21} + q_{22})(q_{11} + q_{21})(q_{12} + q_{22})}},$$

where q11 = 29 is the number of effective preparations with Z < Z(av); q12 = 3 is the number of bioactive chemical compounds with Z ≥ Z(av); q22 = 23 is the number of ineffective chemical compounds with Z ≥ Z(av); and q21 = 5 is the number of ineffective chemical compounds with Z < Z(av) (Table 2). Obviously, the classification model is better the closer the table is to diagonal form.
Checking the significance of the coefficient Φ by the χ² criterion gives the following inequality [7]:

$$\chi^2 = n \Phi^2 > \chi^2_{(cr)},$$

where n = q11 + q12 + q22 + q21. This inequality confirms the non-randomness of the interrelation between Z and the bioactivity of sulfur-containing chemical compounds.
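The fourfold-table check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the coefficient is computed with the standard formula for a 2×2 table, and the critical value 3.841 (one degree of freedom, 5% level) is assumed here because the text does not restate the level used.

```python
import math


def phi_coefficient(q11, q12, q21, q22):
    """Pearson coefficient of association (phi) for a fourfold table."""
    numerator = q11 * q22 - q12 * q21
    denominator = math.sqrt(
        (q11 + q12) * (q21 + q22) * (q11 + q21) * (q12 + q22)
    )
    return numerator / denominator


def chi_square_from_phi(phi, n):
    """Chi-square statistic of a fourfold table expressed through phi."""
    return n * phi ** 2


# Frequencies quoted in the text for Table 2.
q11, q12, q21, q22 = 29, 3, 5, 23
n = q11 + q12 + q21 + q22

phi = phi_coefficient(q11, q12, q21, q22)
chi2 = chi_square_from_phi(phi, n)

# Assumption: 1 degree of freedom, 5% significance level.
CHI2_CRITICAL = 3.841

print(f"n = {n}, phi = {phi:.3f}, chi2 = {chi2:.1f}")
print("association is significant:", chi2 > CHI2_CRITICAL)
```

With the quoted frequencies the statistic lies well above the assumed critical value, which agrees with the conclusion of non-randomness drawn in the text.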
We introduce one more descriptor, for whose calculation it is sufficient to know only the structural formula of a chemical compound. We define this descriptor using the methods of information theory. It is known that a quantitative measure of the information content of a multicomponent system consisting of objects belonging to the same set is given by the Shannon information function [8,9]. For a discrete collection of objects, it is determined in bits as follows:

$$H = -\sum_{i} p_i \log_2 p_i, \qquad \sum_{i} p_i = 1, \qquad p_i = n_i / N,$$

where n_i is the number of atoms of the ith kind and N is the total number of atoms in the molecule. The function H is an integral index of the state of the multicomponent system. The values of p_i determine the share of the ith element in the entire collection of elements, i.e., p_i assigns the number of realizations or possible outcomes. To calculate the specific values of p_i, we use A.N. Kolmogorov's combinatorial approach [9] for a collection of n_i elements entering this set with mass p_i. The function H is used for a quantitative determination of the measure of organization or diversity of multicomponent systems. The values of p_i are calculated from the data on the content of atoms in the molecular structure. In this case, the quantity of information in a molecule is a function only of the numbers of the various atoms of a finite set. The greater the value of the information function, the more diverse the multicomponent system.
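As a minimal sketch of this definition, the information function can be computed directly from the atom counts of a gross formula. The function name and the dictionary representation of a formula are assumptions made for illustration. The example compound is NH2CH2CH2SH (gross formula C2H7NS), for which the paper quotes H = 1.49 bits.

```python
import math


def information_function(atom_counts):
    """Shannon information function H (in bits) of a molecular formula.

    atom_counts maps an element symbol to the number of its atoms n_i;
    p_i = n_i / N, where N is the total number of atoms.
    """
    total = sum(atom_counts.values())
    return -sum(
        (n / total) * math.log2(n / total)
        for n in atom_counts.values()
        if n > 0
    )


# NH2CH2CH2SH written as a gross formula: C2H7NS (11 atoms).
compound = {"C": 2, "H": 7, "N": 1, "S": 1}
h_value = information_function(compound)
print(f"H = {h_value:.2f} bits")
```

Only the counts of each kind of atom enter the calculation, so two isomers with the same gross formula receive the same value of H.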
From Table 1 the average value of the information function is found for the effective radioprotectors and compared with that for the inactive compounds; the resulting inequality indicates the statistical significance of the difference in the average values of the information functions.

We shall check the validity of these classification rules for the following sulfur-containing chemical compounds: NH2CH2CH2CH2SH, NH2CH2CH2SH and NH2CH2CH2SC(=NH)NH2. These chemical compounds have significant antiradiation action. They were not included in the initial random sample. The information function (and the electronic factor) takes the following values: H = 1.43 bits (Z = 2.29), 1.49 bits (Z = 2.36) and 1.62 bits (Z = 2.63), respectively. That is, the classification rules hold for these sulfur-containing chemical compounds.
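The check for these three compounds can be sketched as follows. The definition of Z used here is an assumption: it is taken as the average number of outer-shell (valence) electrons per atom, as suggested by the caption of Table 5, and it reproduces the Z values quoted above. The valence-electron table, the function name and the gross formulas are written out by hand for illustration; the threshold Z* = 3.0 is the one quoted for Table 1.

```python
# Assumption: number of outer-shell (valence) electrons of each element.
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6, "S": 6}

# Threshold character Z(av) = Z* quoted in the text for Table 1.
Z_THRESHOLD = 3.0


def electronic_factor(atom_counts):
    """Average number of outer-shell electrons per atom (assumed form of Z)."""
    total_atoms = sum(atom_counts.values())
    total_electrons = sum(
        VALENCE_ELECTRONS[element] * count
        for element, count in atom_counts.items()
    )
    return total_electrons / total_atoms


# Gross formulas of the three test compounds, counted by hand.
compounds = {
    "NH2CH2CH2CH2SH": {"C": 3, "H": 9, "N": 1, "S": 1},
    "NH2CH2CH2SH": {"C": 2, "H": 7, "N": 1, "S": 1},
    "NH2CH2CH2SC(=NH)NH2": {"C": 3, "H": 9, "N": 3, "S": 1},
}

z_values = {name: electronic_factor(c) for name, c in compounds.items()}
predicted_active = {name: z <= Z_THRESHOLD for name, z in z_values.items()}

for name, z in z_values.items():
    print(f"{name}: Z = {z:.3f}, Z <= Z*: {predicted_active[name]}")
```

All three compounds fall on the active side of the threshold under this assumed definition, in line with the statement that the classification rules hold for them.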
It is easy to show that the information function and the electronic parameter are interrelated (Figure 1). We obtained the following statistics:

$$H(Z) = A + B \cdot Z, \qquad (6)$$

where A = 0.258 and B = 0.527.

We now analyze the chemical compounds of the homologous series of N-substituted S-2-aminoethylthiosulfates (Table 3). These chemicals are also used as radioprotectors. We estimated the information function H and the electronic factor Z for the molecular structures of the compounds of this series. Applying the method of conjugation of qualitative attributes, we can establish the relationship between the magnitude of the therapeutic index T and the values of the electronic factor and the information function (Figure 2).
We can determine the contingency coefficient (Figure 2A and Figure 2B). Figure 3 shows the interrelation of the factors Z and H. We found that there exists a statistically significant linear relationship between H and Z both for different classes of chemical compounds (Table 1) and for a homologous series of compounds (Table 3):

$$H(Z) = 0.31 + 0.501 \cdot Z; \qquad R = 0.92 > R^{(cr)}_{56;0.05} = 0.26. \qquad (7)$$

Importantly, the parameters A and B are close to each other in the correlating equations (6) and (7). Thus, the interrelation between the factors Z and H is very similar for the homologous series of radiation protectors and for the non-homologous series of chemical compounds. That is, for highly active sulfur-containing chemical compounds the regions of this interrelation coincide. However, we should note that this technique is not applicable to tryptamine derivatives [12]. Apparently, this is because the mechanism of the antiradiation action of these compounds is fundamentally different from the mechanism of action of sulfur-containing compounds.

We now examine the linkage between the carcinogenic action of chemical compounds and the factors Z and H. Statistical analysis of the data in Table 4 shows that the factors H and Z separate the carcinogenic chemicals from the non-carcinogenic ones. We take the average values of these factors (H(av) = H* = 1.57 bits and Z(av) = Z* = 2.95) as threshold characters. Then the method of conjugation of the qualitative factors H and Z with the carcinogenic action of chemical compounds gives the following coefficients of association: Φ = 0.61 (χ² = 22.3 > χ²(cr)) and Φ = 0.86 (χ² = 44.4 > χ²(cr)), respectively. For the chemical compounds of Table 4 the interrelation between the factors Z and H is linear (Figure 4, equation (8)).

Thus, we found that the regions of the factor variables Z and H separate the chemical compounds, with statistical reliability, into two subsets with different biological activity. It is important to note that we used different principles in the search for explanatory factors, namely quantum and information principles. Nevertheless, we obtained close results.
Now we analyze the interrelation of the antifungal activity of benzo-2,1,3-thia- and selenadiazole derivatives with the electronic and information factors (Figure 5). For this purpose we use the experimental data of [14]. The antifungal activity of benzo-2,1,3-thiadiazole derivatives [15] was detected on various test objects. However, there is no common method for testing antifungal drugs. This makes it difficult to identify the quantitative relationship between the chemical structure of compounds and their biological action. Therefore, we apply the methodology used above.
The antifungal activity and toxicity of the test series of compounds are listed in Table 5. We use the following symbols: Ft is the inhibition of late blight, Ve is the inhibition of the fungal species Venturia inaequalis, As is the inhibition of the fungal species Aspergillus niger, Fu is the inhibition of the fungal species Fusarium moniliforme.

Table 5. Antifungal activity, toxicity, the average number of outer-shell electrons in the molecule and the value of the information function for the benzo-2,1,3-thia- and selenadiazoles
Columns: inhibition of growth of fungal mycelium, % (concentration 0.003%) [14]; lg(LD50).
Using the data in Table 5 we can determine the number of chemical compounds q22 (lower right quadrant) that satisfy the quadrant inequalities. In both cases, the differences between the variances are statistically insignificant. Therefore, to determine the differences in the average values of the groups we can use the relations (10). The weighted average variance S² is determined from the following relationship:

$$S^2 = \frac{q_{11} S_{11}^2 + q_{22} S_{22}^2}{q_{11} + q_{22}}.$$

The difference of the average values is statistically significant if it is greater than the parameter T.
Using equations (10) we find the differences in the average values of lg LD50. We then verify the grouped chemicals for statistical homogeneity; that is, we find out whether there are chemical compounds whose parameters differ significantly from the mean values. We check the maximum (max) and minimum (min) values of the factors for the chemical compounds. For this purpose we use the τ-distribution [10].

The domain of frequencies q11: activity < 50%, H < H*, Z < Z*; the domain of frequencies q22: activity > 50%, H > H*, Z > Z*; the domain of frequencies q21: activity > 50%, H < H*, Z < Z*; the domain of frequencies q12: activity < 50%, H > H*, Z > Z*.

We have revealed the interrelation of bioactivities with a single parameter Z (or the information function H). This allows us to suggest an interrelation between the bioactivities of the chemical compounds. The results in Table 6 indicate that the bioactivities of these chemical compounds are interlinked with the factors Z and H. That is, these factors are statistically alternative.
By using the data in Table 5 it can be shown that the information function H is linearly related to the factor Z (Figure 8). The correlations are linear and positive for all the sulfur-containing radioprotectors, carcinogenic chemical compounds and antifungal drugs. The slope of the correlation line lies in a narrow range from 0.30 to 0.53, indicating that the interrelation is not random but is inherent in the different molecular structures. The magnitude of this slope is a measure of the mutual coupling of the factors H and Z. Chemicals No. 50-52 were not considered in constructing the correlation equation, since the parameters Z and H for these compounds do not satisfy the condition of homogeneity. Heterogeneity occurs when the number of atoms in the substituent Ri exceeds the number of atoms of the base molecule.

We can presume that there exists a statistically significant interrelation of bioactivities. Threshold processes of biological activity are characterized by the following principle: as long as the descriptor has not reached the threshold value, the bioactivity of the chemical compound is probably negligible. At the same time, if the magnitude of the descriptor lies beyond the threshold value, the biological effect of the drug is close to its maximum value. Thus, without a complete set of experimental physical and chemical data on chemical compounds, we can establish a statistically significant correlation between the molecular structure of chemicals and the bioactivity of the molecules, as well as predict the bioactivity of new chemical compounds.

Conclusion
This approach to modeling bioactivity can also be extended to other classes of chemical compounds. The factors Z and H are not abstractions: they have a physical meaning, which is associated with such properties of molecules as the molecular pseudopotential, the hydrophobicity of molecules [3] and the total electronic energy of the molecules. If these properties of the chemicals are crucial in the mechanism of bioactivity, we can expect some generality of the proposed approach to assessing the interrelation of the bioactivity of chemical compounds with their molecular structure. The positive linear interrelation between the factors Z and H is not random. As is known, the information function determines the diversity of the molecular structure. In turn, the molecular structure is determined by the number of different atoms that form the bound molecular entity. At the same time, the structure of the molecule is not an arbitrary set of different atoms: it is determined by the number of valence electrons in the outer shells of the atoms. Apparently, this quantum-chemical property of molecular compounds establishes the linear interrelation between the factors Z and H.

Here $\sum_i p_i = 1$; N is the discrete number of objects (atoms) of the set, which determines the space of possible values of $p_i = n_i / N$.

Figure 1. Field of correlation and scatter diagram of the electronic factor and the information factor for sulfur-containing chemical compounds (Table 1). Points are the values taken from Table 1. The line is the correlating equation (6).

The contingency coefficients (Figure 2A and Figure 2B) are determined using the threshold characters Z* = 2.75 and H* = 1.7 bits. The resulting coefficients Φ indicate the statistical significance of the association of the bioactivity of the chemical compounds with the factors Z and H.

Figure 2. Distribution of the chemical compounds into the quadrants of the fourfold table (Table 2). A. Interrelation of the therapeutic index T and the electronic factor Z. B. Interrelation of the therapeutic index T and the information function H.

Figure 3. Field of correlation and scatter diagram for the information function and the electronic factor. Points are the H and Z values taken from Table 3. The line is the correlating equation (7).

Figure 4. Field of correlation and scatter diagram of the electronic and information factors for carcinogenicity. Numerical values of the points were taken from Table 4. The line is the correlating equation (8).

Figure 6A and Figure 6B show the distribution of the chemical compounds of Table 5. Using the data in Table 5 we can determine the number of chemical compounds q22 (lower right quadrant) that satisfy the quadrant inequalities, including Z > Z*. The total number of chemical compounds that fall into this area is q22 = 8 (Figure 6A), and the average value of toxicity and the value of the electronic parameter are determined for this group. The corresponding statistics are determined for the compounds with Z < Z* (upper left quadrant of Figure 6A); the total number of chemical compounds in this quadrant is q11 = 15. We verify the statistical significance of the difference in the average values of the parameters that define the alternative grouping. As a preliminary step we determine the statistical significance of the difference of the variances. For this purpose we use Fisher's F-distribution.

Figure 6. Pictorial representation of the conjugation between the toxicity of chemicals and the values of the factors Z (A) and H (B). The chemical compounds are grouped by toxicity in two distinct domains of the plane (Z, lg LD50).

Figure 8. Field of correlation and scatter diagram of the factors H and Z for a series of benzo-2,1,3-thia- and selenadiazole derivatives. Points are the values taken from Table 5. The line is the correlating equation H(Z) = 0.825 + 0.300·Z; R = 0.84 > R(cr)(60; 0.05).

Table 1 presents the radioprotectors, which belong to different chemical classes of sulfur-containing compounds.

Table 4. Electronic and information factors of carcinogenic chemical compounds