Artificial Neural Network (ANN) Approach for Predicting Cu Concentration in Drinking Water of Chahnimeh1 Reservoir in Sistan-Balochistan, Iran


Alireza Shakeri Abdolmaleki 1, Ahmad Gholamalizadeh Ahangar 2,*, Jaber Soltani 3


1 Department of Water Engineering, Faculty of Soil and Water, University of Zabol, Zabol, IR Iran

2 Department of Soil Sciences, Faculty of Soil and Water, University of Zabol, Zabol, IR Iran

3 Department of Water Engineering, Abureyhan Campus, University of Tehran, Tehran, IR Iran


Health Scope: 2 (1); 31-38
Published Online: May 11, 2013
Article Type: Research Article
Received: December 17, 2012
Revised: February 26, 2013
Accepted: March 6, 2013




Background: Access to safe drinking water is a basic human right and is essential for a healthy life. Concerns about the effects of copper on human health have led to numerous guidelines and regulations limiting its concentration in water.

Objectives: The major goal of this study was to develop an artificial neural network (ANN) model of the water quality (heavy metal concentration) of the Chahnimeh1 reservoir and to show the potential of ANNs for producing models capable of efficiently forecasting Cu concentration.

Materials and Methods: Water samples were collected from the Chahnimeh1 reservoir, the most important source of drinking water in Sistan-Balochistan, and analyzed by standard methods for physical quality parameters, namely electric conductivity (EC), total dissolved solids (TDS), temperature (T) and pH, and for heavy metal (Cu) concentration. A three-layer artificial neural network (ANN) model was investigated to predict the Cu concentration in the water of the Chahnimeh1 reservoir. The input variables were EC, TDS, temperature and pH, and the output was the Cu concentration in water. The Levenberg–Marquardt (LM) algorithm was applied to train the ANN.

Results: According to the ANN outputs, a hidden layer with 7 neurons gave the best performance for predicting Cu concentration. The evaluation indexes MSE and R were 0.00008 and 0.9346 for the training set, 0.00019 and 0.8612 for the validation set, and 0.00014 and 0.9372 for the testing set.

Conclusions: The ANN output values were very close to the actual Cu concentrations, indicating that the predicted values are accurate, the network design is appropriate, and the input variables are well suited to the prediction of Cu concentration.


Keywords: Neural Networks; Drinking Water; Iran

Copyright © 2013, Health Promotion Research Center. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License, which permits copying and redistributing the material only for noncommercial purposes, provided the original work is properly cited.
1. Background

Water quality (WQ) is a description of the biological, chemical and physical characteristics of water in connection with its intended use(s) and a set of standards (1-3). Hence, water quality assessment can be defined as the evaluation of the biological, chemical and physical properties of water in reference to natural quality, human health effects and intended uses (4, 5). Heavy metal pollution poses serious hazards to human health through the food chain, causes loss of biodiversity and degrades environmental quality. Recent research into trace elements and heavy metals has produced highly interesting records (6). Escalating anthropogenic activities in river basins and reduced river discharges over the last few decades have increased the organic and inorganic pollution load of surface water bodies (7). Copper is essential for good health; however, exposure to higher doses can be fatal. Long-term exposure to copper can irritate the nose, mouth and eyes and cause headache and diarrhea (8). Copper status has also been associated indirectly with a number of neurological disorders, including Alzheimer’s disease and prion diseases such as bovine spongiform encephalopathy. Human exposure to copper occurs primarily through the consumption of food and drinking water. The relative copper intake from food versus water depends on geographical location; generally, about 20–25% of copper intake comes from drinking water (9).

WQ can be evaluated by a single parameter for a specific objective, or by a number of critical parameters selected carefully to represent the pollution level of the water body of concern and reflect its overall WQ status. However, since no individual parameter can express WQ sufficiently, it is normally assessed by measuring a broad range of parameters, such as temperature, pH, electric conductivity (EC), total dissolved solids (TDS) and the concentrations of heavy metals. Parametric statistical and deterministic models have been the traditional way of modeling water quality, but they require extensive information on various hydrological sub-processes to arrive at the end results. In recent years, several studies have been conducted on water quality forecast models (10, 11). However, since a large number of factors affecting water quality have a complicated non-linear relation with the variables, traditional data processing methods are no longer adequate for solving the problem (12, 13). Artificial neural networks (ANNs), on the other hand, are capable of imitating basic characteristics of the human brain such as self-organization, self-adaptability and error tolerance, and have been widely adopted for model identification, analysis and forecasting, system recognition and design optimization (14). Many statistically based water quality models assume that the relationships between variables are linear and that the variables are normally distributed, whereas ANNs are able to represent non-linear relationships between variables (15). In recent years, ANNs have been applied to model a wide variety of research topics (16-18).

ANN models have been successfully employed for water quality prediction in reservoirs, streams and groundwater (19-22). ANN modeling has the potential to reduce computation time and effort as well as the possibility of errors in the calculation.

The major goal of this study was to develop an ANN model of the water quality (heavy metal concentration) of the Chahnimeh1 reservoir and to show the potential of ANNs for producing models capable of efficiently forecasting Cu concentration. Here, we investigated the possibility of training such a network. The Cu concentration of the reservoir water was taken as the dependent variable, and the physical water quality data served as the independent variables. In this paper, ANN models are identified for computing the concentration of Cu in the reservoir water.

1.1. Artificial Neural Networks Modeling

The artificial neural network is a useful computational tool for predicting and modeling complex relationships among parameters, especially when there is no explicit relation among them (23, 24). The structure of an ANN basically consists of three kinds of layers: the input layer, where the data are fed into the network and the weight of each input variable is calculated; the hidden layer or layers, where the data are processed; and the output layer, where the results of the network are obtained. Each layer includes one or more basic units called nodes or neurons (25). The number of neurons in the layers depends mainly on the problem. Too few hidden neurons limit the ability of the network to learn the process, whereas too many can make training very time consuming and may cause the network to overfit the data (26).

In this study, three-layer neural networks were constructed to compute the metal (Cu) concentration in the reservoir. All computations were performed using Microsoft Excel 2003 and MATLAB (version 7.12, MathWorks, Inc., Natick, MA).

1.2. ANN Description

The network includes an input layer, hidden layers and an output layer. The inputs to the network are water temperature, pH, electric conductivity (EC) and total dissolved solids (TDS). The scaled values are passed into the input layer and then propagated through the hidden layer before reaching the output layer of the network (27). Each node in the hidden and output layers first acts as a summing junction, which combines and modifies the inputs from the previous layer using the following equation (28):

$$y_j = \sum_{i} w_{ij} x_i + b_j$$

where $y_j$ is the net input to node j in the hidden or output layer, $w_{ij}$ is the weight connecting neuron i to neuron j, $x_i$ is the input to neuron j, and $b_j$ is the bias associated with node j (29). A sigmoidal transfer function is usually used to represent nonlinear relationships (30, 31). The general form of this function is shown below (28):

$$Z_j = \frac{1}{1 + e^{-y_j}}$$

Here, $Z_j$ is the output of node j. Because the output of the sigmoidal function lies between 0 and 1, the input and output data should be normalized to the range 0 to 1 (31). Normalizing the values to a uniform range is also vital to prevent data of larger magnitude from overriding the smaller ones. In the present work, the data were scaled to the range 0–1 as follows (32):

$$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where $X_n$ is the normalized value, and $X$, $X_{\max}$ and $X_{\min}$ are the actual, maximum and minimum values of the data set, respectively.
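The summing junction, sigmoidal transfer function and 0–1 scaling described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names (`minmax_scale`, `node_output`) and the weight and bias values are hypothetical, while the four input rows (EC, TDS, T, pH) are the first four training samples listed in Table 1.

```python
import numpy as np

def minmax_scale(x):
    """Scale each column to the range 0-1: (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

def sigmoid(y):
    """Sigmoidal transfer function, output between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-y))

def node_output(x, w, b):
    """Summing junction (weighted inputs plus bias) followed by the sigmoid."""
    y = np.dot(x, w) + b
    return sigmoid(y)

# Columns: EC, TDS, T, pH (first four training rows of Table 1)
samples = np.array([
    [672.0, 345.0, 24.0, 8.20],
    [669.0, 334.0, 23.0, 8.24],
    [689.0, 341.0, 22.7, 8.24],
    [693.0, 326.0, 22.5, 8.23],
])
scaled = minmax_scale(samples)

# Hypothetical weights and bias for a single hidden node (illustration only)
w = np.array([0.4, -0.3, 0.2, 0.1])
b = 0.05
z = node_output(scaled, w, b)
print(z)
```

Scaling each column separately keeps EC (hundreds of µmhos cm-1) from dominating pH (a range of a few hundredths), which is the concern the normalization step addresses.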

2. Objectives

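The aim of testing various network designs was to find the network weight values that give the minimum error (28). The indexes used to assess the efficiency of the network were the mean squared error (MSE) and the correlation coefficient (R) (33). The MSE was calculated as

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{\mathrm{pred},i} - y_{\exp,i} \right)^2$$

where n is the number of points, $y_{\mathrm{pred},i}$ is the output value obtained from the neural network model and $y_{\exp,i}$ is the experimental value; the average of the experimental values, $\bar{y}_{\exp}$, enters the expression for R.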
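As a hedged illustration of the two indexes, the sketch below computes MSE and R for the three testing-set samples listed in Table 1. Two assumptions are made here that the text does not confirm: R is taken to be the Pearson correlation coefficient between experimental and predicted values, and the indexes are computed on the unscaled concentrations. The function names are hypothetical.

```python
import numpy as np

def mean_squared_error(actual, predicted):
    """Mean of the squared differences between predicted and actual values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((predicted - actual) ** 2))

def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient (assumed form of R)."""
    return float(np.corrcoef(actual, predicted)[0, 1])

# Testing-set rows 28-30 of Table 1 (Cu concentration, mg L-1)
actual = [0.02377, 0.01866, 0.02788]
predicted = [0.01819, 0.01534, 0.02730]

mse = mean_squared_error(actual, predicted)
r = correlation_coefficient(actual, predicted)
print(f"MSE = {mse:.6f}, R = {r:.4f}")
```

Because the table values are rounded and the scaling used for the reported indexes is not stated, numbers obtained this way are illustrative and need not reproduce the reported indexes exactly.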

3. Materials and Methods
3.1. Study Area

Sistan-Balochistan is one of the 31 provinces of Iran and the largest in terms of land area (187,502 km²). It is bordered by Khorasan province to the north, Afghanistan and Pakistan to the east, the Oman Sea to the south and Kerman province to the west. The Chahnimeh1 reservoir is located in the Sistan-Balochistan region, which has a series of natural depressions used primarily to store water for irrigation and public water supply (34). These depressions not only store water but also play an important role in flood prevention. During periods of high flow, water is diverted to the reservoirs via an intake and canal with a capacity of up to 1000 m³/s (Figure 1).

Figure 1. Map Showing the Geographical Setting of the Chahnimeh1 Reservoir
3.2. Water Quality Data Set

The data set used in this study was generated by measuring the water quality of the Chahnimeh1 reservoir in Sistan-Balochistan. The sampling sites are spread over a distance of about 20 km, and 2 water samples were collected in spring 2011. Grab water samples were collected from two depths of the water column (20 cm below the surface and at the bottom of the reservoir) (35). All water samples collected during the study period were analyzed for temperature, pH, electric conductivity (EC), total dissolved solids (TDS) and Cu concentration.

3.3. Water Analysis Methods

For the analysis of total copper, all water samples were acidified with HNO3 (5 mL L-1) at the time of collection. A 100-mL aliquot of well-mixed sample was transferred to a beaker, and 2 mL of concentrated HNO3 and 5 mL of concentrated HCl were added. The sample was covered with a ribbed watch glass or other suitable cover and heated on a steam bath, hot plate or other heating source at 90 to 95 °C until the volume was reduced to 15-20 mL. The final volume was then adjusted to 100 mL with reagent water, and the Cu concentration was read by flame atomic absorption spectroscopy (36). The physical water quality parameters (EC, T, TDS and pH) were determined with a portable instrument (HANNA Instruments, model HI 98129) (37).

4. Results
4.1. ANN Modeling

Feed-forward neural networks with one or more hidden layers are used to model nonlinear and complicated functions (28). Nevertheless, choosing the number of hidden layers is difficult (30). Most of the literature indicates that one hidden layer is sufficient for prediction and may be the best choice for any applied feed-forward network design (38). Thus, one hidden layer was used for modeling in this paper (Figure 2).

Figure 2. Schematic Representation of the ANN Architecture Used in This Study

Determining the number of neurons in the hidden layer is crucial, because it affects the general characteristics of the network and the training time (27). The number of neurons in the hidden layer is determined by the complexity of the relationship among the parameters (39). The optimum number of neurons in the hidden layer was found by trial and error.

A network can be trained with various learning algorithms, but judging their suitability is not an easy task. In our research, the Levenberg–Marquardt (LM) back-propagation algorithm was applied to train the ANN. The LM algorithm is an approximation to Newton’s method (40). The results show that the best-performing network design had three layers: an input layer, a hidden layer with 7 nodes and an output layer. The evaluation indexes MSE and R were 0.00008 and 0.9346 for the training set, 0.00019 and 0.8612 for the validation set, 0.00014 and 0.9372 for the testing set, and 0.00010 and 0.9184 for all data (Table 2). The prediction accuracy for all data sets (training, validation and testing) is high, as shown in Table 1, which also lists the values predicted by the best model. Table 2 shows that the MSE values are very small, indicating that the predicted and actual values are close.
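The trial-and-error search for the hidden-layer size and the Levenberg–Marquardt training step can be sketched as follows. This is an illustrative sketch, not the authors' MATLAB workflow: it uses SciPy's `least_squares` with `method="lm"` as the LM solver, the data are synthetic (hypothetical, already scaled to 0–1) rather than the reservoir measurements, no early stopping is applied, and the names `predict` and `train_lm` are invented for the example. SciPy's `"lm"` method needs at least as many residuals as parameters, so the synthetic training set has 90 samples.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import expit  # numerically stable sigmoid

def predict(theta, X, n_hidden):
    """Forward pass of an (n_in)-(n_hidden)-1 network with sigmoid nodes."""
    n_in = X.shape[1]
    k = n_in * n_hidden
    W1 = theta[:k].reshape(n_in, n_hidden)      # input-to-hidden weights
    b1 = theta[k:k + n_hidden]                  # hidden biases
    W2 = theta[k + n_hidden:k + 2 * n_hidden]   # hidden-to-output weights
    b2 = theta[k + 2 * n_hidden]                # output bias
    return expit(expit(X @ W1 + b1) @ W2 + b2)

def train_lm(X, t, n_hidden, seed=0):
    """Fit all weights and biases by Levenberg-Marquardt least squares."""
    rng = np.random.default_rng(seed)
    n_params = X.shape[1] * n_hidden + 2 * n_hidden + 1
    theta0 = rng.normal(scale=0.5, size=n_params)
    fit = least_squares(lambda th: predict(th, X, n_hidden) - t,
                        theta0, method="lm", max_nfev=5000)
    return fit.x

# Hypothetical scaled data: 4 inputs, 1 output in (0, 1)
rng = np.random.default_rng(42)
X = rng.random((120, 4))
t = expit(2.0 * X[:, 0] - 3.0 * X[:, 1] * X[:, 2] + np.sin(3.0 * X[:, 3]))
X_train, t_train = X[:90], t[:90]
X_val, t_val = X[90:], t[90:]

# Trial and error: train one network per hidden-layer size, compare validation MSE
results = {}
for n_hidden in range(3, 10):
    theta = train_lm(X_train, t_train, n_hidden)
    val_error = predict(theta, X_val, n_hidden) - t_val
    results[n_hidden] = float(np.mean(val_error ** 2))

best = min(results, key=results.get)
print("validation MSE by hidden size:", results)
print("selected hidden neurons:", best)
```

Selecting the size on held-out validation data rather than on the training error is one way to guard against the overfitting risk that comes with too many hidden neurons.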

Table 1. Actual and Model-Predicted Cu Concentrations in the Water of Chahnimeh1 Reservoir (Training, Validation and Testing Data Sets)
No. | EC (µmhos cm-1) | TDS (mg L-1) | T (°C) | pH | Actual Cu (mg L-1) | Predicted Cu (mg L-1)
Training Set
1 | 672 | 345 | 24.0 | 8.20 | 0.02623 | 0.02621
2 | 669 | 334 | 23.0 | 8.24 | 0.00739 | 0.00436
3 | 689 | 341 | 22.7 | 8.24 | 0.01883 | 0.02270
4 | 693 | 326 | 22.5 | 8.23 | 0.02500 | 0.02070
5 | 675 | 337 | 22.6 | 8.20 | 0.00366 | 0.00820
6 | 703 | 352 | 22.3 | 8.20 | 0.03400 | 0.02764
7 | 687 | 343 | 22.1 | 8.24 | 0.02147 | 0.02348
8 | 683 | 336 | 24.0 | 8.24 | 0.00997 | 0.01142
9 | 690 | 339 | 22.0 | 8.10 | 0.01898 | 0.01889
10 | 670 | 350 | 23.5 | 8.26 | 0.02400 | 0.02467
11 | 704 | 342 | 22.3 | 8.10 | 0.02090 | 0.01633
12 | 673 | 336 | 24.2 | 8.22 | 0.02311 | 0.02211
13 | 690 | 345 | 22.2 | 8.10 | 0.01902 | 0.02315
14 | 716 | 357 | 21.7 | 8.00 | 0.02630 | 0.02431
15 | 920 | 465 | 15.6 | 7.85 | 0.00471 | 0.00489
16 | 920 | 465 | 16.5 | 7.84 | 0.03599 | 0.03542
17 | 929 | 468 | 16.7 | 7.90 | 0.02284 | 0.02275
18 | 700 | 354 | 22.0 | 8.17 | 0.01400 | 0.01800
19 | 676 | 343 | 22.6 | 8.18 | 0.02488 | 0.01970
20 | 720 | 342 | 23.0 | 8.10 | 0.01000 | 0.01230
21 | 705 | 343 | 21.0 | 8.20 | 0.01965 | 0.01983
22 | 700 | 467 | 17.1 | 7.91 | 0.02569 | 0.02589
23 | 815 | 351 | 21.7 | 8.00 | 0.02247 | 0.02390
24 | 810 | 343 | 22.6 | 8.18 | 0.02448 | 0.02458
Validation Set
25 | 694 | 345 | 23.0 | 8.20 | 0.01258 | 0.01490
26 | 679 | 340 | 22.6 | 8.20 | 0.07470 | 0.09880
27 | 677 | 343 | 22.1 | 8.17 | 0.02285 | 0.01609
Testing Set
28 | 680 | 336 | 24.2 | 8.22 | 0.02377 | 0.01819
29 | 685 | 337 | 23.8 | 8.25 | 0.01866 | 0.01534
30 | 700 | 350 | 23.0 | 7.80 | 0.02788 | 0.02730
Table 2. ANN Model Performance Values for the Training, Validation, Testing and All Data Sets
Data Set | R | MSE
Training Set | 0.9346 | 0.00008
Validation Set | 0.8612 | 0.00019
Testing Set | 0.9372 | 0.00014
All Data | 0.9184 | 0.00010
5. Discussion
5.1. Sensitivity Analysis

Sensitivity analysis is a tool for determining how sensitive a model is to changes in its parameters and in its structure. Here, the focus is on parameter sensitivity. It was assessed through a series of tests on the ANN model to understand how a change in the input parameters changes the evaluation indexes MSE and R. Sensitivity analyses were carried out for all possible combinations of the input variables: P1, electric conductivity (EC); P2, total dissolved solids (TDS); P3, temperature (T); and P4, pH. Table 3 shows the results.

The optimum network, with the lowest MSE and the highest correlation coefficient (R), was obtained with the group of four variables. In the group of one variable, P3 (T) was the most effective parameter, as it gave the lowest MSE together with a high R. In the group of two variables, the best MSE and R were obtained with the combination of P2 (TDS) and P3 (T). The best group of three variables was obtained by combining the best group of two variables (P2 + P3) with P1 (EC). The values of MSE and R improved further when P1 + P2 + P3 was combined with P4, giving the optimum values of MSE and R.

Table 3. Performance Evaluation of Interactions of Input Variables for the LM Algorithm With 7 Neurons in the Hidden Layer for Sensitivity Analysis
No. | Input Variables (a) | MSE | R | Linear Fit Equation
Group of One Variable
1 | P1 | 0.00149 | 0.4427 | y = 0.20x + 0.015
2 | P2 | 0.00142 | 0.4907 | y = 0.21x + 0.015
3 | P3 | 0.00141 | 0.6694 | y = 0.51x + 0.010
4 | P4 | 0.01080 | 0.6967 | y = 0.51x + 0.0094
Group of Two Variables
5 | P1 + P2 | 0.00142 | 0.5775 | y = 0.37x + 0.012
6 | P1 + P3 | 0.00142 | 0.7531 | y = 0.55x + 0.0077
7 | P1 + P4 | 0.00828 | 0.6788 | y = 0.50x + 0.0120
8 | P2 + P3 | 0.00141 | 0.8607 | y = 0.70x + 0.0055
9 | P2 + P4 | 0.00912 | 0.7571 | y = 0.59x + 0.0079
10 | P3 + P4 | 0.01445 | 0.8446 | y = 0.94x + 0.00036
Group of Three Variables
11 | P1 + P2 + P3 | 0.00142 | 0.8149 | y = 0.63x + 0.0079
12 | P1 + P2 + P4 | 0.00154 | 0.6452 | y = 0.38x + 0.0130
13 | P1 + P3 + P4 | 0.00920 | 0.8342 | y = 0.72x + 0.0057
14 | P2 + P3 + P4 | 0.01172 | 0.7924 | y = 0.64x + 0.0075
Group of Four Variables
15 | P1 + P2 + P3 + P4 | 0.00010 | 0.9184 | y = 0.81x + 0.0033

(a) Abbreviations: P1, electric conductivity; P2, total dissolved solids; P3, temperature; P4, pH
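The 15 rows of Table 3 correspond to every non-empty combination of the four inputs. The sketch below shows one way to enumerate those groups and score each one. It rests on stated assumptions: in the study each group was evaluated with an ANN, whereas here a simple least-squares linear fit stands in as a hypothetical scorer so that the example stays short, and the data are synthetic rather than the reservoir measurements.

```python
from itertools import combinations

import numpy as np

inputs = {"P1": "EC", "P2": "TDS", "P3": "T", "P4": "pH"}

def input_groups(names):
    """Return every non-empty combination of input names, smallest groups first."""
    groups = []
    for size in range(1, len(names) + 1):
        groups.extend(combinations(names, size))
    return groups

def evaluate(X, t):
    """Stand-in scorer (hypothetical): linear least-squares fit, returns (MSE, R)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    pred = A @ coef
    mse = float(np.mean((pred - t) ** 2))
    r = float(np.corrcoef(t, pred)[0, 1])
    return mse, r

# Synthetic stand-in data: 30 samples, 4 scaled inputs, 1 target
rng = np.random.default_rng(0)
X_all = rng.random((30, 4))
t = 0.5 * X_all[:, 1] + 0.3 * X_all[:, 2] + 0.05 * rng.standard_normal(30)

names = list(inputs)
scores = {}
for group in input_groups(names):
    cols = [names.index(n) for n in group]
    scores[group] = evaluate(X_all[:, cols], t)

for group, (mse, r) in scores.items():
    print(" + ".join(group), f"MSE={mse:.5f}", f"R={r:.4f}")
```

Replacing `evaluate` with a routine that trains and scores a network on the selected columns would give the same kind of comparison as the table, one row per input group.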

The experimental data and the ANN predictions are compared in Figure 3, which shows excellent agreement between the experimental data and the ANN results.

Figure 3. Comparing ANN Output and Experimental Data for Cu Concentration in Chahnimeh1 Water for (a) Training, (b) Testing, (c) Validation and (d) all Data Sets
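A comparison of this kind is often summarized by a straight-line fit of predicted against measured values. The sketch below fits such a line with `numpy.polyfit` to the first eight training samples of Table 1. It assumes, without confirmation from the text, that the fitted equations reported in Table 3 have this form, with x the measured and y the predicted Cu concentration.

```python
import numpy as np

# First eight training rows of Table 1 (Cu concentration, mg L-1)
actual = np.array([0.02623, 0.00739, 0.01883, 0.02500,
                   0.00366, 0.03400, 0.02147, 0.00997])
predicted = np.array([0.02621, 0.00436, 0.02270, 0.02070,
                      0.00820, 0.02764, 0.02348, 0.01142])

# Least-squares line: predicted = slope * actual + intercept
slope, intercept = np.polyfit(actual, predicted, deg=1)
print(f"y = {slope:.2f}x + {intercept:.4f}")
```

A slope near 1 and an intercept near 0 would indicate that the predictions track the measurements without systematic bias.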
6. Conclusions

In this research, an ANN was used to predict the Cu concentration in the water of the Chahnimeh1 reservoir (Iran). The identified models were trained, validated and tested on Cu concentrations measured during spring 2011. A network design with 4 input variables, 7 hidden neurons and 1 output neuron was found to be suitable for this study. We propose the neural network as an effective tool for computing reservoir water quality; it could also be used in other areas to improve the understanding of reservoir pollution indexes. The ANN can be seen as a powerful predictive alternative to traditional modeling techniques.

References

  • 1. Boyacioglu H. Development of a water quality index based on a European classification scheme. Water SA. 2009; 33(1).
  • 2. Khalil B, Ouarda T, St-Hilaire A. Estimation of water quality characteristics at ungauged sites using artificial neural networks and canonical correlation analysis. J Hydro. 2011; 405(3): 277-87.
  • 3. Liou SM, Lo SL, Wang SH. A generalized water quality index for Taiwan. Environ Monit Assess. 2004; 96(1-3): 35-52.
  • 4. Fernández N, Ramírez A, Solano F. Physicochemical water quality indices: a comparative review. Revista Bistua. 2004; 1(1): 19-30.
  • 5. Flores JC. Comments to the use of water quality indices to verify the impact of Cordoba City (Argentina) on Suquia river. Water Res. 2002; 36(18): 4664-6.
  • 6. Zhang W, Feng H, Chang J, Qu J, Xie H, Yu L. Heavy metal contamination in surface sediments of Yangtze River intertidal zone: an assessment from different indexes. Environ Pollut. 2009; 157(5): 1533-43.
  • 7. Singh KP, Malik A, Mohan D, Sinha S. Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India): a case study. Water Res. 2004; 38(18): 3980-92.
  • 8. Finkelman RB. Health benefits of geologic materials and geologic processes. Int J Environ Res Public Health. 2006; 3(4): 338-42.
  • 9. Llanos RM, Mercer JF. The molecular basis of copper homeostasis and copper-related disorders. DNA Cell Biol. 2002; 21(4): 259-70.
  • 10. Chen J, Chang N, Shieh W. Assessing wastewater reclamation potential by neural network model. Eng Appl Artif Intell. 2003; 16(2): 149-57.
  • 11. Kurunç A, Yürekli K, Çevik O. Performance of two stochastic approaches for forecasting water quality and streamflow data from Yeşilırmak River, Turkey. Env Model Software. 2005; 20(9): 1195-200.
  • 12. Wu HJ, Lin ZY, Gao SL. The application of artificial neural networks in the resources and environment. Resources and Environment in the Yangtze Basin. 2000; 9(2): 241-6.
  • 13. Xiang S, Liu Z, Ma L. Study of multivariate linear regression analysis model for ground water quality prediction. Guizhou Sci. 2006; 24(1): 60-2.
  • 14. Niu Z, Zhang H, Liu H. Application of neural network to prediction of coastal water quality. J Tianjin Polytechnic Uni. 2006; 25(2): 89-92.
  • 15. Lek S, Delacoste M, Baran P, Dimopoulos I, Lauga J, Aulagnier S. Application of neural networks to modelling nonlinear relationships in ecology. Ecological Model. 1996; 90(1): 39-52.
  • 16. Hanbay D, Turkoglu I, Demir Y. Prediction of wastewater treatment plant performance based on wavelet packet decomposition and neural networks. Expert Systems Appl. 2008; 34(2): 1038-43.
  • 17. Messikh N, Samar M, Messikh L. Neural network analysis of liquid-liquid extraction of phenol from wastewater using TBP solvent. Desalination. 2007; 208(1): 42-8.
  • 18. Smits J, Breedveld L, Derksen M, Kateman G, Balfoort H, Snoek J. Pattern classification with artificial neural networks: classification of algae, based upon flow cytometer data. Analytica Chimica Acta. 1992; 258(1): 11-25.
  • 19. Bowers J, Shedrow C. Predicting stream water quality using artificial neural networks. Miscellaneous series Westinghouse Savannah River Co. 2000; (112).
  • 20. Kuo JT, Hsieh MH, Lung WS, She N. Using artificial neural network for reservoir eutrophication prediction. Ecological Model. 2007; 200(1): 171-7.
  • 21. Kuo YM, Liu CW, Lin KH. Evaluation of the ability of an artificial neural network model to assess the variation of groundwater quality in an area of blackfoot disease in Taiwan. Water Res. 2004; 38(1): 148-58.
  • 22. Ru-zhong L. Advance and trend analysis of theoretical methodology for water quality forecast. J Hefei UniTech (Natural Sci). 2006; 1: 7.
  • 23. Gallant SI. Neural network learning and expert systems. MIT Press; 1993.
  • 24. Smith M. Neural networks for statistical modeling. Thomson Learning; 1993.
  • 25. Dreyfus G, Martinez JM, Samuelides M, Gordon MB, Badran F, Thiria S. Apprentissage statistique: Réseaux de neurones, Cartes topologiques, Machines à vecteurs supports. Eyrolles; 2011.
  • 26. Karunanithi N, Grenney WJ, Whitley D, Bovee K. Neural networks for river flow prediction. J Com Civil Eng. 1994; 8(2): 201-20.
  • 27. Hussain M, Shafiur Rahman M, Ng C. Prediction of pores formation (porosity) in foods during drying: generic models by the use of hybrid neural network. J Food Eng. 2002; 51(3): 239-48.
  • 28. Jorjani E, Chehreh Chelgani S, Mesroghli S. Application of artificial neural networks to predict chemical desulfurization of Tabas coal. Fuel. 2008; 87(12): 2727-34.
  • 29. Razavi MA, Mortazavi A, Mousavi M. Dynamic modelling of milk ultrafiltration by artificial neural network. J Membrane Sci. 2003; 220(1): 47-58.
  • 30. Ghaffari A, Abdollahi H, Khoshayand MR, Bozchalooi IS, Dadgar A, Rafiee-Tehrani M. Performance comparison of neural network training algorithms in modeling of bimodal drug delivery. Int J Pharm. 2006; 327(1-2): 126-38.
  • 31. Torrecilla J, Otero L, Sanz P. Optimization of an artificial neural network for thermal/pressure food processing: Evaluation of training algorithms. Com Elec Agri. 2007; 56(2): 101-10.
  • 32. Erzin Y, Rao BH, Singh D. Artificial neural network models for predicting soil thermal resistivity. Int J Thermal Sci. 2008; 47(10): 1347-58.
  • 33. Karul C, Soyupak S, Çilesiz AF, Akbay N, Germen E. Case studies on the use of neural networks in eutrophication modeling. Eco Model. 2000; 134(2): 145-52.
  • 34. Vekerdy Z, Lakatos L, Balla G, Oroszlan G. An international replication, and the need for long term follow up studies. Arch Dis Child Fetal Neonatal Ed. 2006; 91(6).
  • 35. MacLeod SL, McClure EL, Wong CS. Laboratory calibration and field deployment of the polar organic chemical integrative sampler for pharmaceuticals and personal care products in wastewater and surface water. Environ Toxicol Chem. 2007; 26(12): 2517-29.
  • 36. Jang M, Lee HJ, Shim Y. Rapid removal of fine particles from mine water using sequential processes of coagulation and flocculation. Environ Technol. 2010; 31(4): 423-32.
  • 37. Adomako D, Nyarko BJ, Dampare SB, Serfor-Armah Y, Osae S, Fianko JR, et al. Determination of toxic elements in waters and sediments from River Subin in the Ashanti Region of Ghana. Environ Monit Assess. 2008; 141(1-3): 165-75.
  • 38. Hush DR, Horne BG. Progress in supervised neural networks. Signal Processing Magazine, IEEE. 1993; 10(1): 8-39.
  • 39. Cheng J, Li Q, Xiao R. A new artificial neural network-based response surface method for structural reliability analysis. Probabilistic Engineering Mechanics. 2008; 23(1): 51-63.
  • 40. Hagan MT, Menhaj MB. Training feedforward networks with the Marquardt algorithm. IEEE Trans Neural Netw. 1994; 5(6): 989-93.