The first internet site I looked into in depth was:
www.assa.org.za/downloads/aids/summarystats.htm
I was not yet sure what my project was going to focus on, so I explored
the site to see what I could find. It had a lot of data, and I printed
all of it to get a better understanding of what had been studied. The
site already included graphs of its statistics, but I also made my own
in Microsoft Excel.
The first graph I made shows the estimated population of South Africa
over the next fifteen years. The estimates come from the Actuarial
Society of South Africa, the author of the website. I was unable to find
the date the website was last updated, but the statistics state that all
figures from the year 2000 onward are estimates. All of the statistics
from this website are therefore approximations based on trends from
previous years. This limits my conclusions, because estimates are not
exact and cannot be treated as fully reliable.
I cannot draw any firm conclusions from this graph, but it shows that
experts estimate the population will increase rapidly until about 2007.
It then begins to decline fairly quickly, although the downward slope is
not as steep as the increase before 2007.
The second graph I made shows the number of HIV infections in South
Africa. This data came from the same website, and the statistics are
again only approximations calculated by the Actuarial Society of South
Africa. I wanted to see whether there was any relationship between the
rise and fall of the population and the number of HIV infections. I
expected that an increase in population would come with an increase in
HIV infections, and that a decrease in population would come with a
decrease in infections.
Because these statistics are approximations based on past trends, the
two graphs will tend to look alike, but the same trends may not hold in
the future. It is impossible to estimate future trends perfectly,
precisely because they lie in the future. We do not know what will
happen in South Africa in the next ten years that could affect the
population or the number of HIV infections. Based on these graphs,
however, the two appear to be correlated. Both rise steadily until about
2007 and then begin to fall at a slightly slower rate. In the graph of
HIV infections, the trendline is a close fit, as shown by its very high
R-squared value. To find the closest-fitting trendline in Excel, I went
to Insert and chose Add Trendline. I tried each type of trendline to
find the one with the highest R-squared value, which turned out to be
the polynomial trendline. This line shows the increase, the peak in
2007, and then the decrease in the number of HIV infections over the 15
years estimated.
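The trendline comparison described above can be sketched outside Excel. This is only a rough illustration: the HIV infection totals are not reproduced in this report, so it uses the new AIDS orphan figures from the spreadsheet later in this report as stand-in data. The function names are invented for the example, the polynomial is assumed to be of order 2, and Excel may compute its fit differently.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first (normal equations)."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(a, b)


def r_squared(xs, ys, coeffs):
    """R-squared of the fitted polynomial: 1 - residual variation / total variation."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Stand-in data: new AIDS orphans per year, 2001-2015, from this report.
years_since_2000 = list(range(1, 16))
new_orphans = [66004, 88109, 112035, 136269, 157948, 174218, 179638, 179278,
               166820, 145921, 119415, 90495, 62726, 36397, 14200]

r2_linear = r_squared(years_since_2000, new_orphans,
                      polyfit(years_since_2000, new_orphans, 1))
r2_quadratic = r_squared(years_since_2000, new_orphans,
                         polyfit(years_since_2000, new_orphans, 2))

print(f"linear trendline    R-squared: {r2_linear:.3f}")
print(f"quadratic trendline R-squared: {r2_quadratic:.3f}")
```

A series that rises to a peak and then falls is fitted much better by a curve than by a straight line, so the polynomial trendline has the higher R-squared value.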
At this point I was still not sure exactly what I was looking for in my
data, so I continued to make graphs to see whether there were
relationships between them. I decided to look next at the orphan
statistics I had gathered. The website from my previous two graphs had
the information I needed, so I made a graph in Excel from those
statistics. Once again they are only estimates, from 2000 to 2015.
The graph shows a very rapid increase in the number of AIDS orphans
throughout the fifteen years. I did not understand why the number of
AIDS orphans would keep increasing so rapidly when the previous graph
showed the number of HIV infections peaking in 2007 and then decreasing.
I wondered how this could happen and decided that this should be my
thesis question.
Therefore my thesis question is: Why do experts predict an ongoing
increase in the number of AIDS orphans in South Africa over the next
fifteen years?
My hypothesis is: Experts predict this increase because many South
Africans are still living with the disease and will leave children
behind when they die. Another factor affecting the estimate is the age
up to which a child is counted as an orphan.
I wanted to look into the definitions of an AIDS orphan first, so I went
back to my first search on the Google search engine and chose another
website:
http://trochim.human.cornell.edu/gallery/Ruiz/home.html
The title of this website was "What is your definition of orphan?" I
found this interesting because the age limit in the definition of an
‘AIDS orphan’ might be related to the graph. The report included two
definitions. The first, from the Merriam-Webster Dictionary, was: "(1) a
child deprived by death of one or usually both parents; (2) a young
animal that has lost its mother, and (3) one deprived of some protection
or advantage." (http://trochim.human.cornell.edu/gallery/Ruiz/monica5.htm)
The website also gives the definition used for an AIDS orphan: "The
definition of ‘AIDS orphan’ used by UNAIDS, WHO [World Health
Organization] and UNICEF is of a child who loses his/her mother to AIDS
before reaching the age of 15 years. Some of these children have also
lost, or will later lose, their father to AIDS." (http://trochim.human.cornell.edu/gallery/Ruiz/monica5.htm)
I wanted to relate this back to the graph of the number of AIDS orphans,
to show that the age limit in the definition of an ‘AIDS orphan’ is
connected to the increase in their number. Since a child is counted as
an orphan until age 15, this could be linked to my thesis. The data is
estimated for the next fifteen years, so a child who loses their mother
or both parents to AIDS in the year 2000 could still be included in the
estimates for years afterward, as late as 2015 if the child was a
newborn at the time. I found this very interesting because it would
explain why the number keeps increasing. The number of children who
become AIDS orphans each year and the number who already are AIDS
orphans are not reported separately, so I attempted to make that
calculation myself. I made another spreadsheet in Excel and, to
calculate the additional orphans for each year, subtracted the previous
year’s total number of AIDS orphans from the current year’s total. In
each cell I entered a formula such as =C3-B3, which gave the difference
for each year. This is the spreadsheet.

Year                                         2000       2001       2002       2003       2004       2005       2006       2007
Total AIDS orphans (in middle of year)    124,989    190,993    279,102    391,137    527,406    685,354    859,572  1,039,210
New AIDS orphans                              n/a     66,004     88,109    112,035    136,269    157,948    174,218    179,638

Year                                         2008       2009       2010       2011       2012       2013       2014       2015
Total AIDS orphans (in middle of year)  1,218,488  1,385,308  1,531,229  1,650,644  1,741,139  1,803,865  1,840,262  1,854,462
New AIDS orphans                          179,278    166,820    145,921    119,415     90,495     62,726     36,397     14,200
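The subtraction behind the "new AIDS orphans" row can also be written as a short script. This is a minimal sketch of the same year-over-year difference as the =C3-B3 spreadsheet formula, using the totals from the spreadsheet; the variable names are only illustrative.

```python
# Total AIDS orphans (middle of year), 2000-2015, from the spreadsheet.
years = list(range(2000, 2016))
total_orphans = [124989, 190993, 279102, 391137, 527406, 685354, 859572, 1039210,
                 1218488, 1385308, 1531229, 1650644, 1741139, 1803865, 1840262, 1854462]

# New orphans in a year = that year's total minus the previous year's total.
# There is no value for 2000 because no 1999 total is available.
new_orphans = {
    year: current - previous
    for year, previous, current in zip(years[1:], total_orphans, total_orphans[1:])
}

for year, count in new_orphans.items():
    print(f"{year}: {count:,}")

peak_year = max(new_orphans, key=new_orphans.get)
print("Year with the most new AIDS orphans:", peak_year)
```

The total keeps rising through 2015, but the yearly additions stop growing after 2007, the same year the HIV infection graph peaks.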
I also made a
graph for the new calculations I made and this is what it looks like.
I noticed that the
trend in this graph looked similar to the trends in the graphs for the
total number of HIV infections, the number of AIDS deaths and the
general population graph. I wanted to compare all of these graphs so I
compiled them all into one line graph in Excel to observe the trends.
Clearly, this first graph does not represent the data well because of
the difference in scale between the numbers on the y-axis. I decided to
try the graph again without the population data, because those numbers
are by far the largest.
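One possible way around the y-axis problem, which is not the approach taken in this project, is to rescale each series so that its own highest value becomes 1. Series of very different sizes can then share one axis and only their shapes are compared. The sketch below uses the two AIDS orphan series from this report; the function name `scale_to_peak` is hypothetical.

```python
def scale_to_peak(values):
    """Divide every value by the largest value in the series, so the peak becomes 1.0."""
    peak = max(values)
    return [value / peak for value in values]


years = list(range(2001, 2016))
total_orphans = [190993, 279102, 391137, 527406, 685354, 859572, 1039210, 1218488,
                 1385308, 1531229, 1650644, 1741139, 1803865, 1840262, 1854462]
new_orphans = [66004, 88109, 112035, 136269, 157948, 174218, 179638, 179278,
               166820, 145921, 119415, 90495, 62726, 36397, 14200]

scaled_total = scale_to_peak(total_orphans)
scaled_new = scale_to_peak(new_orphans)

# Both series now lie between 0 and 1 and could be plotted on one y-axis.
for year, total, new in zip(years, scaled_total, scaled_new):
    print(f"{year}  total: {total:.2f}  new: {new:.2f}")
```

Rescaling hides the actual sizes of the numbers, so a graph made this way should say that each line is shown relative to its own peak.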
Once again, the graph did not represent my data clearly or show the
trends I was hoping for. I then decided to treat the HIV infections the
same way as the AIDS orphans. Since the data given was the cumulative
number of HIV infections, I calculated the number of new HIV infections
for each year. This is the spreadsheet from Excel; I calculated the
numbers using the same method I used for the new AIDS orphans.

Year                        2000       2001       2002       2003       2004       2005       2006       2007
New HIV infections           n/a    704,680    590,107    469,303    345,140    221,332    103,197      2,399
New AIDS orphans             n/a     66,004     88,109    112,035    136,269    157,948    174,218    179,638

Year                        2008       2009       2010       2011       2012       2013       2014       2015
New HIV infections       -90,090   -155,433   -352,310   -215,689   -215,565   -201,664   -179,306   -153,075
New AIDS orphans         179,278    166,820    145,921    119,415     90,495     62,726     36,397     14,200
This graph did not work either, because of the negative values I was
getting for the number of new HIV infections after the year 2007. At
first I thought the experts were expecting no new reported cases of HIV
after 2007, so that deaths alone would bring the number of HIV
infections down. I realised that this is not realistic, since there is
no known cure for HIV or AIDS yet. This made me wonder why those years
had negative values. I looked back at the number of deaths for those
years and concluded that the number of deaths from AIDS was greater than
the number of new HIV infections, which is why the differences are
negative. This means I cannot calculate the number of new HIV infections
per year from these totals without knowing the number of AIDS deaths in
that year alone. I could attempt this calculation, but it is not
essential to proving my thesis, so I did not concentrate on it. I did
make a graph comparing the number of AIDS deaths with the number of new
AIDS orphans, because I expected some correlation between the two: as
the number of people dying from AIDS decreases, the number of children
orphaned by AIDS should also decrease. This is the graph on the next
page.
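The reasoning about the negative values can be written as a small calculation. This sketch assumes that the yearly total counts people currently living with HIV and that people only leave that total by dying. The numbers in the example are made-up toy values, not figures from the website, and the function names are hypothetical.

```python
def net_change(previous_total, current_total):
    """Year-over-year difference in the total, as computed in the spreadsheet."""
    return current_total - previous_total


def new_infections(previous_total, current_total, deaths_in_year):
    """Recover new infections from the net change, given that year's deaths.

    Assumes: current_total = previous_total + new infections - deaths.
    """
    return net_change(previous_total, current_total) + deaths_in_year


# Toy example only: the total falls from 1,000 to 950 while 120 people die.
change = net_change(1000, 950)
recovered = new_infections(1000, 950, 120)

print("Net change in total:", change)
print("New infections implied:", recovered)
```

A negative difference therefore does not mean that nobody was newly infected, only that deaths outnumbered new infections in that year.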
By stretching this graph vertically, I can see the trend I was looking
for. The increase and decrease occur at almost the same time, and the
two curves have a similar overall shape. This suggests that the number
of new orphans decreases along with the number of deaths caused by AIDS,
which is one possible answer to my thesis question.
I also considered another part of the definition of an AIDS orphan. To
be counted as an orphan, a child only has to lose their mother, which is
still a substantial loss, and this might affect the graph. I thought
that the decrease in the number of infections might reflect a very large
decrease in the number of infected males while the number of infected
females was still rising. This could explain the third graph, because
even if infections were decreasing in the overall population, more women
might still be dying. That would cause a continued increase in the
number of AIDS orphans. This idea led me to research the differences
between male and female infection rates over the next 15 years. I now
wanted to show that there may be a decrease in the number of infected
males but an increase in the number of infected females.
The calculations I made for the number of new orphans can also help
support the first part of my hypothesis. The graph of the number of new
AIDS orphans and the number of AIDS deaths shows some correlation, but I
was not satisfied with its strength. So I decided to repeat, for the
number of AIDS deaths, the calculation I had made for the AIDS orphans
and the HIV infections. I thought that if there was a strong enough
correlation and a relatively small difference between the two sets of
numbers, my hypothesis would be supported. I expected a small difference
because there should be close to the same number of new AIDS orphans as
new AIDS deaths. Any difference could arise because some parents who die
from AIDS have more than one child. While writing that last sentence, I
also thought of something that might skew my data. In the data I am
using, an AIDS orphan is defined only as a child who loses their mother
to AIDS. Therefore, after I graph the deaths for both genders, I will
look for the number of female deaths from AIDS and compare those numbers
as well. This should show an even stronger correlation.
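The strength of a correlation can be measured with a number rather than judged from a graph. Below is a minimal sketch of the Pearson correlation coefficient, which runs from -1 to 1. The yearly AIDS deaths are not reproduced in this report, so the usage example pairs the two series that are available for 2001 to 2007 (the yearly change in HIV infections and the new AIDS orphans) only to show how the function would be called. The function name `pearson_r` is illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Figures for 2001-2007 taken from the spreadsheets in this report.
hiv_change = [704680, 590107, 469303, 345140, 221332, 103197, 2399]
new_orphans = [66004, 88109, 112035, 136269, 157948, 174218, 179638]

r = pearson_r(hiv_change, new_orphans)
print(f"Pearson r for 2001-2007: {r:.3f}")
```

A value near 1 or -1 means the two series move closely together (in the same or opposite directions), while a value near 0 means little linear relationship. The same call could be made with the yearly AIDS deaths and new AIDS orphans if both series were entered.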
This graph shows a fair amount of correlation, but not enough to draw a
strong conclusion. Therefore I went back to the ‘Google’ search engine
to find data on women dying from AIDS up to 2015. I returned to the site
that provided the information for almost all the calculations I have
made so far and found prevalence rates for women aged 20 to 65. This
data was given as a percentage, and I was not sure whether it was a
percentage of the total population or of the population infected with
HIV. I decided to look for more reliable data and ended up at the United
Nations website. The website is:
www.unstats.un.org. The only statistic on females living with AIDS
was once again a percentage for the prevalence of HIV/AIDS among adult
women. This was not what I was looking for, because I needed the number
of deaths of females due to AIDS. I went back to the website
www.assa.org.za/downloads/aids/summarystats.htm
where the total number of AIDS deaths was given, but did not find
any information on the number of female deaths from AIDS. Therefore I
could not base a conclusion on that comparison, but the graph for both
genders does show some relationship between the number of new AIDS
orphans and the number of new AIDS deaths each year. Unfortunately the
data was not available, but I think that if it were, the correlation
would support my hypothesis.
