Description
<p style="marginbottom: 0px; padding: 5px 0px 5px 10px; border: 0px; outline: 0px; verticalalign: baseline; fontfamily: Arial;">Hidden Markov Models are used in several areas of machine learning, such as speech recognition, handwritten letter recognition, and natural language processing.</p><p style="marginbottom: 0px; padding: 5px 0px 5px 10px; border: 0px; outline: 0px; verticalalign: baseline; fontfamily: Arial;"><a name="HiddenMarkovModelsFormalDefinition" style="margin: 0px; padding: 0px; color: rgb(48, 76, 144); border: 0px; outline: 0px; verticalalign: baseline;"></a></p><h2 id="formaldefinition" style="margin: 0px; padding: 20px 10px 5px; fontfamily: Arial; fontweight: normal; lineheight: 27.299999237060547px; color: rgb(85, 85, 85); textrendering: optimizelegibility; fontsize: 1.5em; border: 0px; outline: 0px; verticalalign: baseline;">Formal Definition</h2><p style="marginbottom: 0px; padding: 5px 0px 5px 10px; border: 0px; outline: 0px; verticalalign: baseline; fontfamily: Arial;">A Hidden Markov Model (HMM) is a statistical model of a process consisting of two (in our case discrete) random variables, O and Y, which change their state sequentially. The variable Y with states {y_1, ..., y_n} is called the "hidden variable", since its state is not directly observable. The state of Y changes sequentially according to the so-called Markov property (in our case, first-order). This means that the state-change probability of Y depends only on its current state and does not change over time. Formally, we write: P(Y(t+1)=y_i | Y(0), ..., Y(t)) = P(Y(t+1)=y_i | Y(t)) = P(Y(2)=y_i | Y(1)). The variable O with states {o_1, ..., o_m} is called the "observable variable", since its state can be directly observed.
O does not have a Markov property; its state probability depends statically (that is, independently of time) on the current state of Y.</p><p style="marginbottom: 0px; padding: 5px 0px 5px 10px; border: 0px; outline: 0px; verticalalign: baseline; fontfamily: Arial;">Formally, an HMM is defined as a tuple M=(n,m,P,A,B), where n is the number of hidden states, m is the number of observable states, P is an n-dimensional vector containing the initial hidden-state probabilities, A is the n×n "transition matrix" containing the transition probabilities such that A[i,j] = P(Y(t)=y_i | Y(t-1)=y_j), and B is the m×n "emission matrix" containing the observation probabilities such that B[i,j] = P(O=o_i | Y=y_j).</p>
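<p style="marginbottom: 0px; padding: 5px 0px 5px 10px; border: 0px; outline: 0px; verticalalign: baseline; fontfamily: Arial;">To make the tuple M=(n,m,P,A,B) concrete, the following is a minimal Python sketch, not a reference implementation. The numbers and the helper names (column, is_valid_hmm, sample) are hypothetical and chosen only for illustration. Note that with the index convention above, each column (not each row) of A and B must sum to 1, and that the code counts states from 0 rather than 1.</p>

```python
import random

# Hypothetical toy model M = (n, m, P, A, B); all numbers are illustrative.
n, m = 2, 3
P = [0.6, 0.4]  # P[i] = P(Y(0) = y_i), states are 0-indexed here

# A[i][j] = P(Y(t) = y_i | Y(t-1) = y_j): n x n, each COLUMN sums to 1.
A = [[0.7, 0.4],
     [0.3, 0.6]]

# B[i][j] = P(O = o_i | Y = y_j): m x n, each COLUMN sums to 1.
B = [[0.5, 0.1],
     [0.4, 0.3],
     [0.1, 0.6]]


def column(matrix, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in matrix]


def is_valid_hmm(n, m, P, A, B, tol=1e-9):
    """Check the shapes and the probability constraints of the tuple."""
    shapes_ok = (
        len(P) == n
        and len(A) == n and all(len(row) == n for row in A)
        and len(B) == m and all(len(row) == n for row in B)
    )
    if not shapes_ok or abs(sum(P) - 1.0) > tol:
        return False
    # Every column of A and of B is a distribution conditioned on y_j.
    return all(
        abs(sum(column(A, j)) - 1.0) <= tol
        and abs(sum(column(B, j)) - 1.0) <= tol
        for j in range(n)
    )


def sample(P, A, B, steps, rng):
    """Draw `steps` hidden states and the observations they emit."""
    n, m = len(P), len(B)
    hidden, observed = [], []
    y = rng.choices(range(n), weights=P)[0]  # initial hidden state from P
    for _ in range(steps):
        hidden.append(y)
        # The observation depends only on the current hidden state.
        observed.append(rng.choices(range(m), weights=column(B, y))[0])
        # The next hidden state depends only on the current one.
        y = rng.choices(range(n), weights=column(A, y))[0]
    return hidden, observed


if __name__ == "__main__":
    print(is_valid_hmm(n, m, P, A, B))
    print(sample(P, A, B, steps=5, rng=random.Random(0)))
```

<p style="marginbottom: 0px; padding: 5px 0px 5px 10px; border: 0px; outline: 0px; verticalalign: baseline; fontfamily: Arial;">Only the second list returned by sample corresponds to what an observer would actually see; the first list is the hidden state sequence.</p>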
