All sound is produced by vibration. Try to picture a grand piano in your mind. Think about the soundboard, what it looks like and how it vibrates. Think about what the piano sounds like to the player, to someone standing next to it (in front of the opening), or to someone standing several feet away. The complexity and variety of just this one ambient sound example is amazing.
Now think about a stereo pair of microphones picking up this sound from a specific location. All of the other vantage points are lost, and the complexity of the ambient sound is reduced. The sound is converted into electrical signals, which travel through wires and sound equipment. Finally, these signals cause a pair of speakers to vibrate in an attempt to recreate the sound. The size and shape of the speaker cones are very different from the soundboard of the piano. The sound from the speakers is also much more directional than the wide dispersion pattern of the piano soundboard.
This one example of electronic sound recreation paints a pretty good picture of the differences between ambient and speaker-generated sound. Classical musicians tend to be purists when it comes to this topic. If you attend a classical concert in a true concert hall, few, if any, microphones or speakers are used for amplification. If you do see microphones, they are probably being used for recording. The hall is designed to naturally amplify the sound coming from the stage and add a beautiful ambient reverb. The musicians hear themselves and each other through ambient sound only. The "monitor mix" (if we can call it that) is adjusted by positioning the musicians on the stage.
Unfortunately, not all music is capable of achieving this natural balance. A single acoustic guitar in a club or stadium will never be able to compete with a drum set and the other instruments in the band. Electric guitars and bass guitars are not even designed to produce much ambient sound. Their tone comes from electrical signals generated by the pickups and then sent through amplifiers. Performers working with these instruments and in these environments need sound systems to provide the amplification and balance that the room cannot.
Although speaker vibration can never exactly replicate ambient sound, separation helps with intelligibility. Spreading sound sources across a wide array of speakers (like surround sound) helps to recover some of the space found in ambient sound. However, monitor mixes are usually sent in mono through one speaker per performer (or group of similar performers). Can you imagine how many speakers it would take to provide separate surround sound mixes for every performer? The more sound signals you try to cram into one mono speaker (or a pair of stereo earbuds), the more cramped, muddy and unintelligible everything gets.
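To put a rough number on the cramming problem, here is a minimal Python sketch. It rests on a simplifying assumption that is not from this article: every source in the mix is equally loud and uncorrelated with the others, so their powers add and N sources come out about 10 * log10(N) decibels louder than one source alone. Real instruments are never that tidy, and the function names are made up for illustration, so treat the results as ballpark figures rather than a measurement of any real monitor rig.

```python
import math


def level_rise_db(num_sources: int) -> float:
    """Rise in overall level (dB) when equal-level, uncorrelated sources are summed.

    Assumption: uncorrelated sources add in power, so N sources carry N times
    the power of one source, which is 10 * log10(N) decibels.
    """
    if num_sources < 1:
        raise ValueError("num_sources must be at least 1")
    return 10 * math.log10(num_sources)


def trim_per_source_db(num_sources: int) -> float:
    """Cut (dB) applied to every source so the summed mix stays at one source's level."""
    return -level_rise_db(num_sources)


if __name__ == "__main__":
    # Hypothetical channel counts for a single mono monitor mix.
    for n in (2, 4, 8, 16):
        print(
            f"{n:2d} sources: mix rises {level_rise_db(n):5.2f} dB, "
            f"trim each by {trim_per_source_db(n):6.2f} dB to compensate"
        )
```

Under that assumption, every doubling of the number of sources adds about 3 dB to the mix. Since one speaker can only get so loud, each added source pushes everything else down, which is why something always has to be lost or turned down.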
The degradation of sound quality from conversion to and from electrical signals and the focused dispersion of speakers are the two main causes of monitor mix issues. When we add lack of separation to the equation, monitor mixes become balancing acts of what will fit and what we can afford to lose or turn down. This reality is the reason for everything that I wrote in parts one and two of this series on monitors.