The search for oncogenes is becoming increasingly important in cancer genetics because they are suitable targets for therapeutic intervention. To identify novel oncogenes activated by gene amplification, we used high-resolution comparative genomic hybridization on cDNA microarrays to compare DNA copy number and mRNA expression levels in lung cancer cell lines. We identified several amplicons (5p13, 6p22-21, 11q13, 17q21 and 19q13) that showed a concomitant increase in gene expression. These regions were also found to be amplified in primary lung tumours. We mapped the boundaries of the chromosome 6p amplicon and measured the expression levels of the genes within it. The Sry-HMG box gene SOX4 (sex-determining region Y box 4), which encodes a transcription factor involved in embryonic cell differentiation, was overexpressed 10-fold in cells with amplification relative to normal cells. SOX4 expression was also higher in a fraction of primary lung tumours and lung cancer cell lines, and was associated with the presence of gene amplification. We also found variants of SOX4 in primary lung tumours and cancer cell lines, including a somatic mutation that introduced a premature stop codon (S395X) in the serine-rich C-terminal domain. Although none of the variants increased the transactivation ability of SOX4, overexpression of wild-type SOX4 and of the non-truncated variants in NIH3T3 cells significantly increased the transforming ability of the weakly oncogenic RHOA-Q63L. In conclusion, our results show that SOX4 is overexpressed in lung cancer as a result of gene amplification, and they provide evidence that SOX4 has oncogenic properties.