Epidemiologic studies have established that pregnancy has a bidirectional, time-dependent effect on breast cancer risk; a period of elevated risk is followed by a long-term period of protection. The purpose of the present study was to determine whether pregnancy and involution are associated with gene expression changes in the normal breast, and whether such changes are transient or persistent. We examined the expression of a customized gene set in normal breast tissue from nulliparous, recently pregnant (0-2 years since pregnancy), and distantly pregnant (5-10 years since pregnancy) age-matched premenopausal women. This gene set included breast cancer biomarkers and genes related to immune/inflammation, extracellular matrix remodeling, angiogenesis, and hormone signaling. RNA was extracted from laser capture microdissected tissue from formalin-fixed paraffin-embedded reduction mammoplasty and benign biopsy specimens and analyzed using real-time PCR arrays containing 59 pathway-specific and 5 housekeeping genes. We report that 14 of the 64 selected genes (22%) were differentially regulated (at the P < 0.05 level) in nulliparous versus parous breast tissues. Based on gene set analysis, inflammation-associated genes were significantly upregulated as a group in both parous groups compared with nulliparous women (P = 0.03). Moreover, parous subjects had significantly reduced expression of estrogen receptor α (ERα, ESR1), progesterone receptor (PGR), and ERBB2 (Her2/neu) and 2-fold higher estrogen receptor-β (ESR2) expression compared with nulliparous subjects. These initial data, among the first on gene expression in samples of normal human breast, provide intriguing clues about the mechanisms behind the time-dependent effects of pregnancy on breast cancer risk. Cancer Prev Res; 3(3); 301–11
- lobular involution
- extracellular matrix
The existence of a link between pregnancy and breast cancer was observed centuries ago, when it was noted that nuns had elevated breast cancer risk (1). This observation led to the hypothesis that pregnancy is protective against breast cancer (2), which was subsequently supported by numerous epidemiologic analyses (3, 4). One outcome of these studies was the finding that the protective effect is not immediate and that there is a transient increase in breast cancer risk in the years following pregnancy (5–8). Breast cancer risk peaks 5 to 7 years after delivery, gradually decreases in the following years, and eventually falls below the level of the nulliparous population, at which point pregnancy is protective (9). The magnitude and duration of this increase in risk depend on several factors including parity and age at first delivery, with younger first-time mothers showing the shortest duration of increased risk (7). Moreover, breast cancers diagnosed during or soon after pregnancy, referred to as pregnancy-associated breast cancers (PABC), tend to be more aggressive (10–12). Pregnancy therefore seems to exert a bidirectional, time-dependent effect on breast cancer; it is protective in the long term, but in the short term, tumor incidence and aggressiveness are increased. The transient increase in risk associated with pregnancy has been attributed, quite logically, to the stimulatory effects of the hormonal milieu on early, incipient lesions that may be present in the breast at the time of pregnancy or shortly thereafter. A more recent hypothesis asserts that growth stimulation may result from the process of involution itself because this is a tissue remodeling process analogous to wound healing, in which angiogenesis, inflammation, and extracellular matrix (ECM) alterations are activated (9, 13).
Thus, it seems likely that several processes, each with its own temporal dynamics and downstream effects on cancer incidence, may be operating simultaneously in breast tissue after pregnancy.
Surprisingly, very few studies have been able to examine changes in gene expression in normal breast tissue in relation to time since pregnancy. Recent improvement in techniques for quantifying gene expression in archived formalin-fixed paraffin-embedded (FFPE) tissue specimens has opened opportunities for addressing these issues in population studies. In this report, we present evidence for the validity of a quantitative real-time PCR method for measuring the expression of panels of selected genes in well-defined cell populations collected by laser capture microdissection of lobular units from reduction mammoplasty and benign breast biopsies. We then compare the expression of specific gene sets associated with angiogenesis, ECM remodeling, inflammation, and hormone signaling in normal breast epithelium from three groups of similarly aged premenopausal women: nulliparous, recently pregnant (0-2 years since last pregnancy), and distantly pregnant (5-10 years since last pregnancy). Our objective was to evaluate the association between recent pregnancy and the expression of involution- and hormone-related genes in normal breast and to determine whether such changes seemed to be temporary or persistent.
Materials and Methods
Patients and samples
Patients were women between 18 and 45 y of age who underwent a reduction mammoplasty or breast biopsy with benign findings at the University of Illinois at Chicago Hospital (Table 1). This study was approved by the Institutional Review Board of the University of Illinois at Chicago. Patients were eligible for the study if their paraffin blocks were available and the date of their last pregnancy was noted in their medical record. The oldest samples used in the study were 5 y old. Patients were divided into categories according to the time elapsed since their last pregnancy at the time of tissue collection, as follows: nulliparous, recent pregnancy (0 to 2 y since last pregnancy), and distant pregnancy (5 to 10 y since last pregnancy). RNA of acceptable quality was extracted from 20 nulliparous, 17 recent pregnancy, and 15 distant pregnancy samples.
Laser capture microdissection
Breast lobules were isolated from FFPE tissues using a Leica AS/LMD laser capture microdissection system (Supplementary Fig. S1). For each sample, 10-μm sections were cut, deparaffinized, and rehydrated using standard protocols with RNase-free reagents and stained with 0.025% toluidine blue. Stained slides were dehydrated to 100% ethanol before laser capture microdissection. Before RNA extraction, adjacent 4-μm H&E-stained sections were mapped by a pathologist to ensure that only physiologically normal tissue would be used in the study. Approximately 10 to 40 lobules were cut per sample, depending on lobule size, yielding 100 to 200 ng of total RNA.
RNA isolation, cDNA synthesis, and linear amplification
RNA was isolated using the Recoverall kit (Ambion) following the manufacturer's instructions. Because contaminating genomic DNA was not completely eliminated by on-the-column DNase treatment, a second DNase treatment was done on the isolated RNA using the Ambion Turbo DNA-free kit. Following the second DNase treatment, RNA was considered DNA-free because 40 real-time PCR cycles failed to give an amplification signal. Isolated RNA was concentrated using the ethanol precipitation method with ammonium acetate and linear acrylamide as carrier. For each sample, 100 ng of RNA was reverse transcribed using the Retroscript kit (Ambion). Amplicons of interest were linearly amplified using the Preamp kit (Applied Biosystems) with 10 cycles of amplification.
We selected 64 genes in total (Supplementary Table S1): 54 genes associated with breast involution via the processes of inflammation, ECM remodeling, or angiogenesis; 5 genes (ESR1, ESR2, PGR, ERBB2, and PRKCA) related to estrogen signaling and breast cancer prognosis; and 5 housekeeping genes for data normalization. When choosing the gene list for our study, we relied on two mouse microarray experiments that explored the gene expression profile of involution (14, 15). Other genes, such as matrix metalloproteinases and key interleukins, were chosen based on their importance in the processes of matrix remodeling and inflammation (16–18). TaqMan gene expression assays were purchased preloaded into 384-well plates (each gene in duplicate, fitting three samples per plate) from Applied Biosystems. Assays were designed with small amplicons (<100 bp) to enhance detection sensitivity and reduce bias in degraded tissue. Real-time PCR reactions were carried out in an Applied Biosystems 7900HT machine. For each gene of interest, expression levels were normalized to the average expression of ACTB and HPRT1, which were the two independent housekeeping genes with the highest correlation within samples. ΔCT values for each gene can be found in Supplementary Table S2.
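The normalization described above can be illustrated with a minimal sketch. It assumes that "average expression" means the arithmetic mean of the two reference CT values and that fold changes are derived with the common comparative CT convention (2 raised to the negative ΔΔCT, which presumes roughly 100% PCR efficiency); neither detail is stated in the text. The function names and all CT values are hypothetical and are not study data.

```python
from statistics import mean

# Reference (housekeeping) genes named in the text.
REFERENCE_GENES = ("ACTB", "HPRT1")


def delta_ct(sample_ct, target):
    """Return CT(target) minus the mean CT of the reference genes."""
    reference = mean(sample_ct[g] for g in REFERENCE_GENES)
    return sample_ct[target] - reference


def fold_change(delta_ct_test, delta_ct_control):
    """Relative expression of test vs. control as 2^-(ddCT).

    Assumption: one PCR cycle corresponds to one doubling.
    """
    return 2.0 ** -(delta_ct_test - delta_ct_control)


# Hypothetical CT values for illustration only.
nulliparous = {"ACTB": 20.0, "HPRT1": 26.0, "ESR2": 31.0}
parous = {"ACTB": 20.5, "HPRT1": 25.5, "ESR2": 30.0}

d_null = delta_ct(nulliparous, "ESR2")  # 31.0 - 23.0
d_par = delta_ct(parous, "ESR2")        # 30.0 - 23.0
ratio = fold_change(d_par, d_null)      # one cycle earlier -> 2-fold higher
```

In this toy case a ΔCT that is one cycle lower in the parous sample corresponds to a 2-fold higher relative expression, which is the scale on which a 2-fold difference would be read.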
Expression of individual genes was compared between groups with either a two-sided t test, when two groups were compared, or a one-way ANOVA, followed by Tukey's honestly significant difference (HSD) test, when more than two groups were compared. The results of the t test for each gene comparing the nulliparous versus the parous groups can be found in Supplementary Table S3. Unsupervised hierarchical clustering was done using the average linkage method to determine coordinate expression of sets of genes (Cluster software; ref. 19), and results were visualized with Treeview (19). A weighted-voting procedure with leave-one-out cross validation was used to identify the optimal gene subset (n = 11) for classifying samples by parity group (GenePattern; ref. 20) and significance analysis of microarray-gene set (SAM-GS) analysis was used for gene set hypothesis testing (21).
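As an illustration of the per-gene two-group comparison, the following sketch computes a two-sided t test on ΔCT values using only the Python standard library. It assumes the pooled-variance (Student) form of the test; the text does not say whether equal variances were assumed. The P value is obtained by numerically integrating the t density, and the input values are hypothetical. The ANOVA and Tukey HSD steps used for three-group comparisons are not shown.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P value: 1 - 2 * integral of the density from 0 to |t|.

    The integral is approximated with Simpson's rule (steps must be even).
    """
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def pooled_t_test(a, b):
    """Student's two-sample t test (assumption: equal variances)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df, two_sided_p(t, df)


# Hypothetical dCT values for one gene (not study data).
nulliparous_dct = [8.1, 7.9, 8.4, 8.0]
parous_dct = [7.0, 7.2, 6.8, 7.1]
t_stat, df, p_value = pooled_t_test(nulliparous_dct, parous_dct)
```

Because a lower ΔCT means higher expression, a positive t statistic in this arrangement would correspond to higher expression in the parous group.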
We performed preliminary studies to show the validity of our approach to gene expression measurement in archival FFPE tissues. We compared the gene expression levels of ERα in a set of 16 breast tumor samples for which the ERα status was known, as determined by immunohistochemistry. Levels of mRNA correlated well with the ERα status, thereby supporting the validity of the technique (Fig. 1).
Because of the effects of fixation, paraffin embedding, and storage and the limited RNA material that could be obtained by laser capture microdissection, about 40% of our genes of interest could not be detected by real-time PCR alone. Therefore, linear amplification was used, which, while preserving the relative ratios of the genes within a sample, ensured that close to 90% of the genes would be detected. In addition, detection occurred at lower threshold cycles, thereby increasing the reproducibility of the results. Figure 2A and B shows linearity for ERα expression relative to glyceraldehyde-3-phosphate dehydrogenase, comparing amplified with nonamplified samples. Figure 2C shows that preamplification was close to linear across our entire gene set of 64 genes, as indicated by the correlation coefficient of 0.98 between the CT values obtained in nonamplified versus preamplified samples.
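The linearity check can be sketched as a Pearson correlation between CT values measured with and without preamplification. The example below uses hypothetical CT values and an idealized preamplification that lowers every CT by the same number of cycles; under that assumption, gene-to-gene CT differences (and therefore relative ratios within a sample) are unchanged and the correlation is exactly 1. Real data, as reported above, fall slightly short of this ideal.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical CT values for four genes in one sample (not study data).
ct_plain = [24.0, 28.5, 31.0, 35.5]
# Idealized preamplification: every target is detected 9 cycles earlier.
ct_preamp = [ct - 9.0 for ct in ct_plain]

r = pearson_r(ct_plain, ct_preamp)

# A uniform shift leaves the CT gaps between genes untouched.
gaps_plain = [b - a for a, b in zip(ct_plain, ct_plain[1:])]
gaps_preamp = [b - a for a, b in zip(ct_preamp, ct_preamp[1:])]
```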
Genes differentially expressed between the nulliparous and postpregnancy groups
To determine whether the post-pregnancy period would be characterized by a different gene expression pattern than the nulliparous state, recent and distant parous samples were combined and compared with the nulliparous group using a two-sided t test (Table 2).
We found that 14 of the 64 genes selected (59 pathway specific and 5 housekeeping, Supplementary Table S1), or 22% of the total genes examined, were differentially expressed at the P < 0.05 significance level. Five of these genes belong to the inflammation-related category: chemokine ligand 21 (CCL21), lipopolysaccharide binding protein (LBP), serum amyloids A1/A2 (SAA1/2), and immunoglobulin κ constant region (IGKC) were upregulated, whereas immunoglobulin heavy δ chain (IGHD) was downregulated. Therefore, 5 of 28 inflammation-related genes chosen for analysis, or 18%, were differentially regulated between the nulliparous and parous groups. We applied SAM-GS analysis (21) to test the hypothesis that parity status was associated with expression of 28 genes in the inflammation gene set as a group. This analysis, which is similar to a summary t-test approach, yielded a P value of 0.03, indicating that the process of inflammation is differentially regulated between the nulliparous and parous groups.
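The idea behind the gene set test can be sketched as follows: a moderated t-like statistic is computed for each gene in the set, the squared statistics are summed, and the sum is compared with its distribution under random permutation of the group labels. This is a simplified illustration rather than the SAM-GS software itself (21). In particular, the small constant `s0` added to the denominator is fixed here as a hypothetical value, whereas SAM-type methods estimate it from the data, and the expression values are invented.

```python
import math
import random
from statistics import mean, variance


def sam_d(a, b, s0):
    """Moderated t-like statistic for one gene (s0 stabilizes small variances)."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    s = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / (s + s0)


def gene_set_statistic(rows, labels, s0):
    """Sum of squared per-gene statistics over all genes in the set."""
    total = 0.0
    for values in rows:
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        total += sam_d(a, b, s0) ** 2
    return total


def gene_set_p_value(rows, labels, s0=0.1, n_perm=1000, seed=0):
    """Permutation P value; the +1 terms count the observed labeling itself."""
    rng = random.Random(seed)
    observed = gene_set_statistic(rows, labels, s0)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_set_statistic(rows, shuffled, s0) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical dCT values: 3 genes x 10 samples (5 per group), not study data.
labels = [0] * 5 + [1] * 5
rows = [
    [8.0, 8.3, 7.9, 8.4, 8.1, 6.9, 7.2, 6.8, 7.1, 7.0],
    [5.1, 5.0, 5.3, 4.9, 5.2, 6.0, 6.2, 5.9, 6.1, 6.3],
    [6.4, 6.6, 6.2, 6.5, 6.3, 6.5, 6.4, 6.6, 6.2, 6.3],
]
observed, p_set = gene_set_p_value(rows, labels)
```

Because the statistic pools evidence across the whole set, a set can reach significance even when only a minority of its genes pass the per-gene threshold, as with the 5 of 28 inflammation-related genes reported here.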
Our gene set included six angiogenesis-related genes, and of these, angiopoietin 1 (ANGPT1) and vascular endothelial growth factor (VEGFA) were significantly differentially regulated between the two groups. ANGPT1 was upregulated whereas VEGFA was downregulated in the parous group. These findings do not suggest an obvious role for angiogenesis in the parous breast, at least not as detected with the small gene set used. Twenty ECM-related genes were included in our array; two of these genes, E-cadherin (CDH1) and tissue inhibitor of metalloproteinase 2 (TIMP2), were significantly differentially regulated, whereas collagen type 1 (COL1A1) and transforming growth factor β3 (TGFB3) were both downregulated, but with P values slightly above the conventional threshold (P = 0.051 and P = 0.057, respectively). Gene set analysis indicated that an association between ECM-related gene expression and parity status did not reach statistical significance (P = 0.09). The most striking finding was the highly significant differential expression of ERα, PGR, ERβ, and ERBB2 between the nulliparous and parous groups. ERα, PGR, and ERBB2 were downregulated and ERβ was upregulated in the parous group relative to the nulliparous group.
We were especially interested in identifying genes that might be differentially expressed either transiently or more persistently following a pregnancy, as it has been shown that PABCs have the worst prognosis if detected during or within a few years after pregnancy (9). Expression of three of the inflammation-related genes, CCL21, LBP, and SAA1/2, was upregulated in both parous groups compared with the nulliparous group (Fig. 3), consistent with persistent effects. Interestingly, B2M, initially chosen as a housekeeping gene, was also significantly upregulated in both parous groups. Although B2M is frequently used as a control in gene expression experiments, it is part of the MHC class I complex. Two genes, PGR and TGFB3, exhibited biphasic expression with reduced expression only in the recently pregnant group compared with the nulliparous and distantly pregnant groups (Fig. 4).
Examination of breast cancer biomarker gene expression in the three groups revealed consistent and potentially persistent gene regulation changes (Fig. 5). Relative to nulliparous subjects, ERα expression was downregulated with statistical significance in the recently parous group and remained downregulated (with borderline statistical significance, P = 0.076) in the distantly parous group. ERβ was upregulated in a similar manner, achieving significance in the recently parous group with near-significant upregulation in the distantly parous group. Likewise, ERBB2 was significantly downregulated in the recently pregnant group compared with nulliparous samples and showed a trend toward persistent downregulation in the distantly pregnant group.
Identification of an inflammation-specific gene signature
Because gene set analysis identified an association between inflammation-related gene expression and parity, we decided to test the power of our 64-gene set in detecting inflammation/matrix remodeling in a set of nonmalignant breast tissue samples in which inflammation had been caused by either a previous biopsy or bacterial infection. A pathologist identified the biopsy sites from four patients with previous biopsies and marked the inflamed tissue. From the same patients, control blocks were also chosen that did not show any histologic signs of inflammation. Only normal, noncancerous tissue was used. Three samples that contained regions of mastitis-associated abscess were also included to represent inflammation caused by bacterial infection, rather than the wound-like inflammation caused by a biopsy procedure. The gene expression profiles of these samples were compared using a nonsupervised clustering technique. To avoid diluting the effect of the inflammation-specific genes, we selected a subset of 32 genes that was able to perfectly classify noninflamed tissue and each type of inflamed tissue (Fig. 6A). The 32-gene subset was then applied for nonsupervised clustering analysis to all the specimens, including the inflamed reference samples, to further evaluate the parity-inflammation association (Fig. 6B). Although a strong association between inflammation and any particular patient category was not identified, there was a suggestive level of clustering, with 14 of 20 nulliparous samples separating from the inflammatory samples at the first bifurcation. However, a few samples (both nulliparous and parous) co-clustered with the inflammatory samples and showed similar expression patterns.
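For readers unfamiliar with average-linkage clustering, the following minimal sketch shows how samples are merged and what the "first bifurcation" of the resulting tree represents. It assumes Euclidean distance between expression profiles, which is only one of several metrics available in clustering software and is not stated in the text; the profiles are hypothetical. Stopping the merging when two clusters remain corresponds to the top split of the dendrogram.

```python
def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters whose members have the smallest
    mean pairwise distance, until n_clusters remain.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [euclidean(profiles[p], profiles[q])
                         for p in clusters[i] for q in clusters[j]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical two-gene profiles: samples 0-2 inflamed, 3-5 noninflamed.
profiles = [
    [1.0, 5.0], [1.2, 5.1], [0.9, 4.8],
    [4.0, 1.0], [4.2, 1.1], [3.9, 0.8],
]
first_split = average_linkage(profiles, n_clusters=2)
```

A sample set is "perfectly classified" in this sense when each of the two clusters at the top split contains samples of one type only.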
To determine whether we could identify a subset of genes that optimally differentiates the nulliparous and parous breast specimens, a weighted-voting procedure with leave-one-out cross validation was used to identify a set of 11 genes that most strongly discriminated between parity groups. In nonsupervised clustering, this 11-gene signature, which was mostly composed of inflammation and hormone signaling genes, completely segregated the nulliparous specimens from the parous groups at the first bifurcation (Fig. 7).
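The weighted-voting idea can be sketched as below. The sketch assumes the widely used signal-to-noise formulation, in which each gene's weight is the difference in class means divided by the sum of the class standard deviations and each gene votes in proportion to how far a sample lies from the midpoint between the class means. Whether the GenePattern run (20) used exactly these settings is not stated in the text. Gene selection is repeated inside every leave-one-out fold so that the held-out sample never influences the choice of genes; all data are hypothetical.

```python
from statistics import mean, stdev


def train(samples, labels, n_genes):
    """Keep the n_genes with the largest absolute signal-to-noise weight."""
    candidates = []
    for g in range(len(samples[0])):
        a = [s[g] for s, lab in zip(samples, labels) if lab == 0]
        b = [s[g] for s, lab in zip(samples, labels) if lab == 1]
        denom = stdev(a) + stdev(b)
        if denom == 0:
            continue  # gene carries no usable variation
        weight = (mean(a) - mean(b)) / denom
        boundary = (mean(a) + mean(b)) / 2
        candidates.append((abs(weight), g, weight, boundary))
    candidates.sort(reverse=True)
    return [(g, w, b) for _, g, w, b in candidates[:n_genes]]


def predict(model, sample):
    """Each gene votes weight * (x - boundary); a positive total means class 0."""
    total = sum(w * (sample[g] - b) for g, w, b in model)
    return 0 if total > 0 else 1


def loocv_accuracy(samples, labels, n_genes):
    """Leave-one-out cross validation: train on all but one, test on the one."""
    correct = 0
    for i in range(len(samples)):
        model = train(samples[:i] + samples[i + 1:],
                      labels[:i] + labels[i + 1:], n_genes)
        correct += predict(model, samples[i]) == labels[i]
    return correct / len(samples)


# Hypothetical dCT values: rows are samples, columns are three genes.
# Label 0 = nulliparous, 1 = parous; the third gene is uninformative.
samples = [
    [8.1, 5.0, 6.2], [7.9, 5.2, 6.9], [8.3, 4.9, 6.4], [8.0, 5.1, 6.6],
    [7.0, 6.1, 6.5], [7.2, 5.9, 6.3], [6.8, 6.0, 6.8], [7.1, 6.2, 6.4],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = loocv_accuracy(samples, labels, n_genes=2)
```

In practice the number of genes would be varied and the value giving the best cross-validated accuracy retained, which is one way an optimal subset size such as 11 could be arrived at.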
To our knowledge, this is the first report to examine gene expression in the normal premenopausal human breast, comparing nulliparous with recently or more distantly pregnant age-matched women. One objective was to look for evidence of inflammation, ECM remodeling, and angiogenesis in the normal breast immediately following pregnancy because these processes are known to be active during involution and were hypothesized to contribute to PABCs (9). However, we were also interested in determining whether parity was associated with altered gene expression in key hormone signaling pathways. We found that 22% of the selected genes were significantly differentially regulated in nulliparous versus parous breast tissue. In particular, inflammation-related genes were found to strongly discriminate nulliparous and parous specimens as determined by gene-by-gene comparisons (t test), hierarchical clustering, and SAM-GS analysis. Furthermore, expression profiles of the hormone signaling genes ESR1, ESR2, PGR, and ERBB2 indicate parity-mediated protective effects. These results provide evidence for the first time that pregnancy is associated with persistent changes in gene expression in normal breast tissue that could contribute to the protective and stimulatory effects of pregnancy on breast cancer risk.
In agreement with animal models of involution and parity (14, 15, 22), our results show an increase in immune/inflammatory activity in the post-pregnancy breast as suggested by the upregulation of LBP, SAA1/2, and CCL21. Interestingly, this response was not limited to the recently pregnant group but also characterized more distant pregnancies. This is surprising, given that the animal studies showed an upregulation of numerous inflammation/immune response–related genes in the early days of involution, followed by a diminution of the effect as involution progressed. In the human breast, however, it seems that pregnancy has a lasting effect that is detectable for up to 10 years and quite possibly longer. Lipopolysaccharide-binding protein (LBP) is an acute phase protein markedly induced during inflammation and infection (23, 24). Its major site of synthesis is the liver, but LBP is also produced in the intestine and lungs, where it may play a role in local responses to bacterial lipopolysaccharide (25, 26). Serum amyloids (SAA1/2) are multifunctional acute phase proteins possessing both pro- and anti-inflammatory activities that can increase 1,000-fold in the blood in response to inflammation and infection (27). Gene expression studies revealed that they are expressed in normal, pathologic, inflammatory, and tumor tissues, predominantly by epithelial cells and macrophages (27–29). CCL21, a chemokine expressed mainly in lymph nodes, has a critical role in the homing of T cells to these organs (30, 31). In breast cancer cells, CCL21 stimulates pseudopod formation and induces directional migration through a reconstituted basement membrane (32). We also found an upregulation of immunoglobulin κ light chains, presumably due to an increased lymphocytic presence. Conversely, immunoglobulin δ (IgD) expression was repressed, as shown by downregulation of its heavy chain (IGHD).
IgD accounts for less than 1% of plasma immunoglobulins; it is found in the plasma membrane of circulating B lymphocytes and its exact role is still unknown (33).
In perhaps the only previous study to address the effect of parity on gene expression profiles in the normal breast, Russo and coworkers applied cDNA microarrays to ethanol-fixed tissue obtained from postmenopausal women (34–36). In this discovery effort, a large number of genes were found to be differentially expressed between parous and nulliparous women, including a number of immune-related genes that were upregulated in the parous group. The time since last pregnancy was not specified, although it was substantially longer than in our study because the average age at tissue sampling was >65 years for the parous women. There is no overlap between the immune-related genes identified by the Russo group and our group, potentially due to differences in menopausal status, time since pregnancy, or the technologies used (real-time PCR versus cDNA microarrays).
On close examination of the 32-gene signature that best discriminates the inflammatory and noninflammatory breast tissue, we find that individual gene expression in the parous versus nulliparous specimens is not always in the same direction as found in the inflammatory versus noninflammatory specimens (Fig. 6). For example, SAA1/2 is upregulated in parous versus nulliparous specimens, whereas it is downregulated in inflammatory versus noninflammatory breast tissues. Although this finding seems counterintuitive on its surface, it is known that inflammation is a dynamic process that is tightly controlled both locally and temporally, with key genes being turned on and off as required during the progression of inflammation and subsequent healing. Therefore, it is possible that the sample set that was used to derive the inflammation-specific gene signature represents a phase or type of inflammation different from the one that might characterize the parous human breast.
The process of breast involution involves substantial ECM remodeling. We find that E-cadherin is significantly downregulated in parous tissues (Table 2). E-cadherin is a cell-cell adhesion molecule expressed in most epithelial cells and is involved in the formation of adherens junctions. Loss of E-cadherin is a frequent event during invasion and metastasis (37), and noninvasive tumor cells become invasive if E-cadherin expression and function are lost (38). Evidence suggests that in addition to functioning as a metastasis suppressor, E-cadherin is also a tumor suppressor, as disturbance of E-cadherin–mediated adhesion induces neoplastic growth (39). We hypothesize that reduced E-cadherin levels might contribute to the increase in cancer incidence and aggressiveness post-pregnancy.
Our finding that expression of ERα is downregulated following pregnancy in relatively young women suggests a possible mechanism for the long-term protective effect of pregnancy on breast cancer risk. If the ERα-positive cell population is reduced, the number of cell divisions driven by estrogen and the potential development and progression of cancer should be reduced. A decrease in ERα expression after pregnancy has been described in animal models, and in the human breast, there is immunohistochemical evidence for increased ERα expression in undifferentiated breast lobules that are associated with the nulliparous state (40, 41). However, the observation, on the mRNA level, that this reduction occurs soon after pregnancy and that it persists is novel. PGR, an ERα target gene, was also downregulated in the recently pregnant subjects but seemed to return toward nulliparous levels in the more distantly pregnant group. Recently, Taylor et al. (42) compared the expression of PgR (both A and B isoforms) and ERα, by immunohistochemistry, in reduction mammoplasty specimens from nulliparous versus parous premenopausal women. PgRA staining was significantly reduced by two thirds in the parous subjects; PgRB and ERα staining were also reduced, but nonsignificantly, in this small study with limited statistical power.
The increase in ERβ expression following pregnancy suggests another protective mechanism, as ERβ generally antagonizes the proliferative effects mediated by ERα (43) and has been associated with good prognosis in breast cancer (44). The receptor tyrosine kinase gene ERBB2 or HER2/neu is amplified in 25% to 30% of all breast cancers and is associated with poor prognosis (45), whereas its role in the normal breast is linked to pregnancy and lactation in the mouse mammary gland (46). Recent evidence suggests that ERα and ErbB2 collaborate in mediating estrogen-induced mitogenesis, and therefore, the downregulation of ERBB2 in the parous group also may be linked to a protective effect (47).
The present study has several notable strengths, including the demonstration of a valid technique for measuring gene expression in formalin-fixed breast tissue, the use of laser microdissection to ensure collection of homogeneous material, frequency-matching of comparison groups by age, and the inclusion of both recently and less recently pregnant subjects. However, several limitations must also be noted. RNA degradation increases with storage time for fixed-tissue samples; therefore, only archival tissues no more than 5 years old could be included in the study. This limitation, coupled with the need to include specimens with reliable pregnancy history information, required us to screen more than 300 medical records to obtain the 52 eligible cases reported in this study. Although this sample size is relatively small, the finding that 22% of the selected gene set showed statistically significant differential gene expression and the biological consistency of the results, especially for genes related to estrogen signaling, suggest that the observed differences were not likely to be due to chance.
In conclusion, we found evidence for altered expression of a number of immune/inflammation-related genes in the parous human breast. These results are consistent with the hypothesis that the parous breast undergoes a weak but persistent inflammatory process that may explain the aggressiveness and frequency of PABC. At the same time, we also found evidence for changes in sensitivity to estrogen signaling that could contribute to the long-term protective effect of pregnancy against breast cancer. We are in the process of constructing tumor microarrays to correlate protein with gene expression. This report is the first to describe the relationship of parity to gene expression specific to the processes of inflammation, ECM remodeling, angiogenesis, and hormone signaling in normal premenopausal breast tissue. The present data provide a foundation for future studies that are needed to address the short-term and long-term effects of pregnancy on the development of breast cancer.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
We thank the Avon Products Foundation for supporting the current work.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Prevention Research Online (http://cancerprevres.aacrjournals.org).
- Received April 7, 2009.
- Revision received November 17, 2009.
- Accepted November 23, 2009.
- ©2010 American Association for Cancer Research.