Monad laws
The three laws
All instances of the Monad typeclass should satisfy the following laws:
Left identity:   return a >>= h   ≡  h a
Right identity:  m >>= return     ≡  m
Associativity:   (m >>= g) >>= h  ≡  m >>= (\x -> g x >>= h)

Here, p ≡ q simply means that you can replace p with q and vice versa, and the behaviour of your program will not change: p and q are equivalent.
Using eta-expansion, the associativity law can be rewritten for clarity as:
(m >>= (\x -> g x)) >>= h  ≡  m >>= (\x -> g x >>= h)

or equally:
(m >>= (\x -> g x)) >>= (\y -> h y)  ≡  m >>= (\x -> g x >>= (\y -> h y))
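As a rough illustration, the three laws can be spot-checked for the Maybe monad at a handful of sample values. This is only a sketch: halve and decrement are hypothetical helpers introduced for the example, and passing checks at a few inputs does not prove the laws.

```haskell
-- Spot-check the three monad laws for Maybe at a few sample values.
-- halve and decrement are hypothetical helpers, not library functions.

halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

-- Left identity: return a >>= h  ≡  h a
leftIdentity :: Int -> Bool
leftIdentity a = (return a >>= halve) == halve a

-- Right identity: m >>= return  ≡  m
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- Associativity: (m >>= g) >>= h  ≡  m >>= (\x -> g x >>= h)
associativity :: Maybe Int -> Bool
associativity m =
  ((m >>= halve) >>= decrement) == (m >>= (\x -> halve x >>= decrement))

samples :: [Maybe Int]
samples = Nothing : map Just [-2 .. 6]

main :: IO ()
main = do
  print (all leftIdentity [-2 .. 6])
  print (all rightIdentity samples)
  print (all associativity samples)
```

Each line printed by main reports whether one law held at every sample.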

Is that really an "associative law"?
In this form, not at first glance. To see precisely why these are called the identity and associativity laws, you have to change your notation slightly.
The monad-composition operator (>=>) (also known as the Kleisli-composition operator) is defined in Control.Monad:
infixr 1 >=>
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
f >=> g = \x -> f x >>= g
Using this operator, the three laws can be expressed like this:
Left identity:   return >=> h     ≡  h
Right identity:  f >=> return     ≡  f
Associativity:   (f >=> g) >=> h  ≡  f >=> (g >=> h)

It's now easy to see that monad composition is an associative operator with left and right identities.
This is a very important way to express the three monad laws, because they are precisely the laws required for the Kleisli arrows of a monad (functions of type a -> m b) to form a mathematical category. To summarise in haiku:
 Monad axioms:
 Kleisli composition forms
 a category.
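The Kleisli form of the laws can be spot-checked in the same spirit. Functions cannot be compared with (==), so the sketch below compares both sides pointwise on a few inputs. The names halve, decrement, agreeOn, and inputs are assumptions made for this example; only (>=>) comes from Control.Monad.

```haskell
import Control.Monad ((>=>))

-- Hypothetical Kleisli arrows for the Maybe monad.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

-- Compare two functions pointwise on a list of sample inputs.
agreeOn :: Eq b => [a] -> (a -> b) -> (a -> b) -> Bool
agreeOn xs p q = all (\x -> p x == q x) xs

inputs :: [Int]
inputs = [-4 .. 8]

-- return >=> h  ≡  h
leftIdentityK :: Bool
leftIdentityK = agreeOn inputs (return >=> halve) halve

-- f >=> return  ≡  f
rightIdentityK :: Bool
rightIdentityK = agreeOn inputs (halve >=> return) halve

-- (f >=> g) >=> h  ≡  f >=> (g >=> h)
associativityK :: Bool
associativityK =
  agreeOn inputs ((halve >=> decrement) >=> halve)
                 (halve >=> (decrement >=> halve))

main :: IO ()
main = print (leftIdentityK, rightIdentityK, associativityK)
```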
The monad laws in practice
If we rewrite the laws using Haskell's do notation:

Left identity:
  do
    x' <- return x
    f x'
  ≡
  do
    f x

Right identity:
  do
    x <- m
    return x
  ≡
  do
    m

Associativity:
  do
    y <- do
      x <- m
      f x
    g y
  ≡
  do
    x <- m
    do
      y <- f x
      g y
  ≡
  do
    x <- m
    y <- f x
    g y

we can see that the laws represent plain, ordinary common-sense transformations of imperative programs.
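To make the associativity rewrite concrete, here is a small sketch in the list monad that writes the same computation with the three nestings of do blocks and checks that they agree. The names start, neighbours, and scaled are hypothetical, chosen only for this illustration.

```haskell
-- Three ways of nesting do blocks, all in the list monad.
-- start, neighbours, and scaled are hypothetical example definitions.

start :: [Int]
start = [1, 5]

neighbours :: Int -> [Int]
neighbours x = [x - 1, x + 1]

scaled :: Int -> [Int]
scaled y = [y, 10 * y]

-- Inner do block on the left of the bind.
nestedLeft :: [Int]
nestedLeft = do
  y <- do
    x <- start
    neighbours x
  scaled y

-- Inner do block as the final statement.
nestedRight :: [Int]
nestedRight = do
  x <- start
  do
    y <- neighbours x
    scaled y

-- No nesting at all.
flat :: [Int]
flat = do
  x <- start
  y <- neighbours x
  scaled y

main :: IO ()
main = print (nestedLeft == nestedRight && nestedRight == flat)
```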
But why should monadic types satisfy these laws?
When we see a program written in the form on the left-hand side, we expect it to do the same thing as the corresponding right-hand side, and vice versa. And in practice, people do write the lengthier left-hand side once in a while.
First example: beginners tend to write
skip_and_get = do
  unused <- getLine
  line <- getLine
  return line
and it would really throw off both beginners and veterans if that did not act like (by right identity):
skip_and_get = do
  unused <- getLine
  getLine
Second example: next, you go ahead and use skip_and_get:
main = do
  answer <- skip_and_get
  putStrLn answer
The most popular way of comprehending this program is by inlining (whether the compiler does so or not is an orthogonal issue):
main = do
  answer <- do
    unused <- getLine
    getLine
  putStrLn answer
and applying associativity so you can pretend it is:
main = do
  unused <- getLine
  answer <- getLine
  putStrLn answer
The associativity law is amazingly pervasive: you have always assumed it, and you have never noticed it.
The associativity of a binary operator allows any number of operands to be combined by applying the binary operator with any arbitrary grouping to get the same well-defined result, just like the result of summing up a list of numbers is fully defined by the binary (+) operator no matter which parenthesization is used (yes, just like in folding a list of any type of monoidal values).
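The grouping argument can be sketched with folds: a right fold and a left fold group the operands in opposite ways, yet give the same answer both for (+) and for (>=>) with return as the identity. The list steps and the helpers halve and decrement are assumptions made for this example.

```haskell
import Control.Monad ((>=>))

-- Hypothetical Kleisli arrows for the Maybe monad.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

steps :: [Int -> Maybe Int]
steps = [halve, decrement, halve]

-- halve >=> (decrement >=> (halve >=> return))
composeR :: Int -> Maybe Int
composeR = foldr (>=>) return steps

-- ((return >=> halve) >=> decrement) >=> halve
composeL :: Int -> Maybe Int
composeL = foldl (>=>) return steps

-- The same idea for ordinary addition.
sumsAgree :: Bool
sumsAgree = foldr (+) 0 [1 .. 10 :: Int] == foldl (+) 0 [1 .. 10 :: Int]

-- Compare the two compositions pointwise on sample inputs.
compositionsAgree :: Bool
compositionsAgree = all (\x -> composeR x == composeL x) [-4 .. 12]

main :: IO ()
main = print (sumsAgree, compositionsAgree)
```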
Whether compilers make use of them or not, you still want the laws for your own sake, just so you can avoid pulling your hair out over counter-intuitive program behaviour, which depends (in brittle fashion!) on e.g. how many redundant returns you insert or how you nest your do blocks...
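For contrast, here is a sketch of a deliberately law-breaking instance. Counted is a hypothetical type invented for this illustration: its bind adds one to a counter on every use, which violates both identity laws, so inserting a redundant return changes the result.

```haskell
import Control.Monad (ap)

-- Counted is a hypothetical type: a value paired with a bind counter.
data Counted a = Counted Int a
  deriving (Eq, Show)

instance Functor Counted where
  fmap f (Counted n a) = Counted n (f a)

instance Applicative Counted where
  pure = Counted 0
  (<*>) = ap

instance Monad Counted where
  Counted n a >>= f =
    let Counted m b = f a
    in Counted (n + m + 1) b  -- the "+ 1" is what breaks the identity laws

-- A hypothetical step that does not touch the counter itself.
step :: Int -> Counted Int
step x = Counted 0 (x + 1)

direct :: Counted Int
direct = step 1

-- By right identity this "should" equal direct, but here it does not.
withRedundantReturn :: Counted Int
withRedundantReturn = do
  y <- step 1
  return y

main :: IO ()
main = do
  print direct               -- prints: Counted 0 2
  print withRedundantReturn  -- prints: Counted 1 2
```

The type checker accepts this instance without complaint, which is why the laws are an obligation on the instance author rather than something the compiler enforces.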