Derivation of $\frac{\partial}{\partial \Sigma}\left(-\frac{1}{2}\log(\det(M\otimes K + I_T \otimes \Sigma))\right)$


























I've done a derivation below, and I'm wondering if it's correct. If you help me with both derivations, I'll throw in some bonus/bounty points.



For ease of notation, let's define $U_{M\Sigma}=M\otimes K + I_T \otimes \Sigma$.



Also, $U_{M\Sigma}$ is symmetric and positive definite.



We know that



\begin{align*}
d\left(-\frac{1}{2}\log(\det(M\otimes K + I_T \otimes \Sigma))\right) &= \operatorname{Tr}\left(-\frac{1}{2}U_{M\Sigma}^{-1}\,dU_{M\Sigma}\right)\\
&= \operatorname{Tr}\left(-\frac{1}{2}\text{vec}(U_{M\Sigma}^{-1})^\intercal\, \text{vec}(I_T\otimes d\Sigma)\right)
\end{align*}



and we notice that



\begin{align*}
\text{vec}(I_T\otimes d\Sigma) &= \left[
\begin{array}{c}
\text{vec}(e_1\otimes d\Sigma) \\
\vdots\\
\text{vec}(e_T\otimes d\Sigma)
\end{array}
\right]=\left[
\begin{array}{c}
((e_1\otimes I_D)\otimes I_D)\, \text{vec}(d\Sigma) \\
\vdots\\
((e_T\otimes I_D)\otimes I_D)\, \text{vec}(d\Sigma)
\end{array}
\right]\\
&= (\text{vec}(I_T)\otimes I_{D^2})\, \text{vec}(d\Sigma)
\end{align*}



Therefore we have
\begin{align*}
d\left(-\frac{1}{2}\log(\det(M\otimes K + I_T \otimes \Sigma))\right) &= \operatorname{Tr}\left(-\frac{1}{2}U_{M\Sigma}^{-1}\,dU_{M\Sigma}\right)\\
&= \operatorname{Tr}\left(-\frac{1}{2}\text{vec}(U_{M\Sigma}^{-1})^\intercal (\text{vec}(I_T)\otimes I_{D^2})\, \text{vec}(d\Sigma)\right)\\
&= \operatorname{Tr}\left(-\frac{1}{2}\Gamma_{\Sigma}^\intercal\, d\Sigma\right)
\end{align*}



Here $\Gamma_{\Sigma}$ is defined by $\text{vec}(\Gamma_{\Sigma})^\intercal=\text{vec}(U_{M\Sigma}^{-1})^\intercal (\text{vec}(I_T)\otimes I_{D^2})$, and $\frac{\partial}{\partial \Sigma}\left(-\frac{1}{2}\log(\det(M\otimes K + I_T \otimes \Sigma))\right)=-\frac{1}{2}\Gamma_{\Sigma}$.
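One way to test whether this is right is a finite-difference check on small random matrices, comparing $-\frac{1}{2}\Gamma_{\Sigma}$ with a numerical gradient of $-\frac{1}{2}\log\det(M\otimes K + I_T\otimes\Sigma)$. A minimal sketch in Julia (current syntax; the sizes and helper names below are only placeholders, not part of the derivation):

using LinearAlgebra

D, T = 2, 3                                    # placeholder sizes
rspd(q) = (A = randn(q, q); A*A' + q*I)        # random symmetric positive-definite matrix
M, K, Sigma = rspd(T), rspd(D), rspd(D)
IT = Matrix(1.0I, T, T)

f(S) = -0.5 * logdet(kron(M, K) + kron(IT, S))

U    = kron(M, K) + kron(IT, Sigma)
Uinv = inv(U)

# Gamma_Sigma from the definition above: vec(Gamma)' = vec(U^{-1})' * (vec(I_T) ⊗ I_{D^2})
P   = kron(reshape(vec(IT), :, 1), Matrix(1.0I, D^2, D^2))
Gam = reshape(P' * vec(Uinv), D, D)

# entrywise central finite differences of f at Sigma
h = 1e-6
E(i, j) = (Z = zeros(D, D); Z[i, j] = 1.0; Z)
G_fd = [(f(Sigma + h*E(i, j)) - f(Sigma - h*E(i, j))) / (2h) for i in 1:D, j in 1:D]

println(maximum(abs.(G_fd .- (-0.5 .* Gam))))  # ≈ 0 only if the claimed gradient is right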




Extra Points for help with the derivation below:



\begin{align*}
d\left(-v^\intercal (M\otimes K + I_T\otimes \Sigma)^{-1}v\right) &= -v^\intercal \left(-U_{M\Sigma}^{-1}\,(dU_{M\Sigma})\,U_{M\Sigma}^{-1}\right)v\\
&= v^\intercal U_{M\Sigma}^{-1}(I_T\otimes d\Sigma)U_{M\Sigma}^{-1}v\\
&= v^\intercal U_{M\Sigma}^{-1}\,\text{vec}(d\Sigma\, W_{M\Sigma}\, I_T)
\end{align*}



where $W_{M\Sigma}$ is defined by being conformable (so $W_{M\Sigma}$ is $D\times T$) and $\text{vec}(W_{M\Sigma}) = U_{M\Sigma}^{-1}v$. Therefore, we have
\begin{align*}
d\left(-v^\intercal (M\otimes K + I_T\otimes \Sigma)^{-1}v\right) &= \operatorname{Tr}(W_{M\Sigma}^\intercal\, d\Sigma\, W_{M\Sigma}\, I_T)\\
&= \operatorname{Tr}(W_{M\Sigma} W_{M\Sigma}^\intercal\, d\Sigma)
\end{align*}



And we have $\frac{\partial}{\partial \Sigma}\left(-v^\intercal (M\otimes K + I_T\otimes \Sigma)^{-1}v\right)=W_{M\Sigma}^\intercal W_{M\Sigma}$.
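The same kind of finite-difference probe applies to this second claim, with $W_{M\Sigma}$ realized as the $D\times T$ reshape of $U_{M\Sigma}^{-1}v$. Again, a rough Julia sketch with placeholder sizes and names (not part of the derivation):

using LinearAlgebra

D, T = 2, 3                                    # placeholder sizes
rspd(q) = (A = randn(q, q); A*A' + q*I)        # random symmetric positive-definite matrix
M, K, Sigma = rspd(T), rspd(D), rspd(D)
IT = Matrix(1.0I, T, T)
v  = randn(T*D)

g(S) = -dot(v, (kron(M, K) + kron(IT, S)) \ v)

U = kron(M, K) + kron(IT, Sigma)
W = reshape(U \ v, D, T)                       # vec(W) = U^{-1} v, conformable as D×T

h = 1e-6
E(i, j) = (Z = zeros(D, D); Z[i, j] = 1.0; Z)
G_fd = [(g(Sigma + h*E(i, j)) - g(Sigma - h*E(i, j))) / (2h) for i in 1:D, j in 1:D]

# G_fd is D×D; W'*W is T×T here and cannot be compared entrywise, so compare with W*W'
println(maximum(abs.(G_fd .- W*W')))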



Now, if we define $L= -\frac{1}{2}\log(\det(U_{M\Sigma})) - v^\intercal U_{M\Sigma}^{-1}v$, then
$$\frac{\partial L}{\partial \Sigma}=-\frac{1}{2}\Gamma_{\Sigma}+W_{M\Sigma}^\intercal W_{M\Sigma}.$$



Also, if $\Sigma=\text{Diag}(e^{\tilde{\sigma}_i^2})$, then



$$\frac{\partial L}{\partial \Sigma}\frac{\partial \Sigma}{\partial \tilde{\sigma}_i^2} = \text{diag}\left(\left(\frac{\partial}{\partial \Sigma}L\right)^\intercal \text{Diag}(e^{\tilde{\sigma}_i^2})\right).$$
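Since $\Sigma$ is diagonal here, the right-hand side should reduce to an elementwise product. A small sketch of how I would code it ($G$ and $s$ are only placeholders for $\frac{\partial L}{\partial\Sigma}$ and the vector of $\tilde{\sigma}_i^2$):

using LinearAlgebra

D = 2
G = randn(D, D)                        # placeholder for ∂L/∂Σ
s = randn(D)                           # s[i] plays the role of the log-variance σ̃²_i

grad1 = diag(G' * Diagonal(exp.(s)))   # the expression written above
grad2 = exp.(s) .* diag(G)             # equivalent elementwise form
println(maximum(abs.(grad1 .- grad2)))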

































  • There is a problem with the 2nd line of the gradient. Note that $$\eqalign{ {\rm tr}(A\,dX) &= A^T:dX \cr &= {\rm vec}(A^T):{\rm vec}(dX) \cr &= {\rm vec}(A^T)^T\,{\rm vec}(dX) \cr &= (K\,{\rm vec}(A))^T\,{\rm vec}(dX) \cr &= {\rm vec}(A)^TK^T\,{\rm vec}(dX) \cr }$$ where $K$ is the commutation matrix associated with the Kronecker product.
    – greg, Jan 20 at 3:46












  • @greg Many thanks for the help. I'm not sure I understood your comment. I forgot to state that $U_{M\Sigma}$ is symmetric and positive definite. Sorry!
    – An old man in the sea., Jan 20 at 13:55


















1 Answer



















Instead of vectorization, take advantage of the block structure of your matrix by introducing a block version of the diag() operator
$$B_k={\rm bldiag}(M,k,n)$$
which extracts the $k^{\rm th}$ block along the diagonal of $M$, where $1\le k\le n$.

The dimension of the block is $\tfrac{1}{n}$ of the corresponding dimension of the parent matrix.



Also note that for any value of $k$, $\;{\rm bldiag}\big((I_T\otimes \Sigma),k,T\big) = \Sigma$,

and further, these are the only non-zero blocks in the entire matrix.
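For concreteness, here is one way such an operator could be written in Julia (the function name and the assumption that $n$ divides both dimensions of $M$ are just conventions for this sketch):

using LinearAlgebra

# one possible realization of bldiag(M, k, n)
function bldiag(M::AbstractMatrix, k::Integer, n::Integer)
    m, l = size(M, 1) ÷ n, size(M, 2) ÷ n      # dimensions of a single block
    return M[(k-1)*m+1:k*m, (k-1)*l+1:k*l]
end

# e.g. bldiag(kron(Matrix(1.0I, 4, 4), S), 3, 4) == S for any square S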



Back to your specific problem. To reduce the clutter let's drop most subscripts, ignore the scalar factors, rename the variable $\Sigma\rightarrow S$ so it's not confused with summation, and rename $T\rightarrow n$ so as not to confuse it with the transpose operation.
$$\eqalign{
U &= M\otimes K + I_n\otimes S \cr
\phi &= \log\det U \cr
d\phi &= d\,{\rm tr}(\log U) \cr
&= U^{-T}:dU \cr
&= U^{-T}:(I_n\otimes dS) \cr
&= \sum_{k=1}^n{\rm bldiag}\big(U^{-T},k,n\big):{\rm bldiag}\big((I_n\otimes dS),k,n\big) \cr
&= \sum_{k=1}^nB_k:dS \cr
&= B:dS \cr
\frac{\partial\phi}{\partial S} &= B \cr
}$$

The second problem is quite similar.
$$\eqalign{
W &= vv^T \cr
\psi &= -W:U^{-1} \cr
d\psi &= W:U^{-1}\,dU\,U^{-1} \cr
&= U^{-T}WU^{-T}:dU \cr
&= \sum_{k=1}^n{\rm bldiag}\big(U^{-T}WU^{-T},k,n\big):dS \cr
&= C:dS \cr
\frac{\partial\psi}{\partial S} &= C \cr
}$$

For coding purposes, assume you have
$$A\in{\mathbb R}^{pm\times pn}$$
and you wish to calculate the sum of the block diagonals, i.e.
$$B = \sum_{k=1}^p{\rm bldiag}(A,k,p) \quad\in {\mathbb R}^{m\times n}$$

In almost all programming languages you can access a sub-matrix using index ranges, so you don't need to waste RAM creating vectors and matrices to hold intermediate results.



For example, in Julia (or Matlab) you can write

B = zeros(m, n)
for k = 1:p
    # accumulate the k-th diagonal block of A
    B += A[k*m-m+1:k*m, k*n-n+1:k*n]
end


So this single for-loop will calculate the gradients shown above.
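Applied to the matrices of this question, with hypothetical sizes and names (and noting that $U$ is symmetric here, so $U^{-T}=U^{-1}$), the whole computation might look like:

using LinearAlgebra

s, n = 2, 3                                    # D and T in the question's notation
rspd(q) = (A = randn(q, q); A*A' + q*I)        # random SPD test matrices
M, K, S = rspd(n), rspd(s), rspd(s)
v = randn(n*s)

U  = kron(M, K) + kron(Matrix(1.0I, n, n), S)
Ui = inv(U)                                    # equals U^{-T} since U is symmetric
Q  = Ui * (v*v') * Ui                          # U^{-T} W U^{-T} with W = v*v'

blk(A, k) = A[(k-1)*s+1:k*s, (k-1)*s+1:k*s]    # k-th diagonal block
B = sum(blk(Ui, k) for k in 1:n)               # ∂φ/∂S
C = sum(blk(Q,  k) for k in 1:n)               # ∂ψ/∂S

G = -0.5*B + C    # gradient of L = -(1/2)*logdet(U) - v'*inv(U)*v with respect to S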






Update


Here is a little Julia (v $0.6$) script to check the various "vec-kronecker" expansions.

s,t = 2,3; dS = 4*rand(s,s); dS += dS'; It = eye(t); Is = eye(s);
Iss = kron(Is,Is); M = kron(Is,It[:,1]);
for k = 2:t; M = [ M; kron(Is,It[:,k]) ]; end
M = kron(M,Is);

x = kron(vec(It), Iss)*vec(dS);
y = vec(kron(It,dS));
z = M*vec(dS);

println( "x = $(x[1:9])ny = $(y[1:9])nz = $(z[1:9])" )
x = [3.07674, 5.3285, 5.3285, 3.51476, 0.0, 0.0, 0.0, 0.0, 0.0]
y = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]
z = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]


In symbols
$$\eqalign{
M &= \pmatrix{I_s\otimes e_1\cr I_s\otimes e_2\cr \vdots \cr I_s\otimes e_t}\otimes I_s \cr
x &= \big({\rm vec}(I_t)\otimes I_{s^2}\big)\,{\rm vec}(dS) \cr
y &= {\rm vec}(I_t\otimes dS) \cr
z &= M\,{\rm vec}(dS) \cr
x &\ne y = z \cr
}$$






























  • Greg, thanks for your answer. I'll need to study it carefully. However, I'm really interested in knowing whether I got a correct derivation above. The reason why is that it's part of a programme I coded, and rederiving the gradient would mean too much work in the code... I'll give you a bounty for the work. ;)
    – An old man in the sea., Jan 20 at 21:46












  • Greg, thanks for the extra info. However, I'm really, really interested in knowing if what I got is right. I hope you can help me with that.
    – An old man in the sea., Jan 21 at 1:48










  • An easy way to check is to pick random $(M,\,K,\,\Sigma,\,d\Sigma)$ matrices and calculate the gradient using your formula and my formula. But I think there's a problem in your derivation with the expansion of this term: $$\eqalign{ M &= \pmatrix{I_D\otimes e_1\cr I_D\otimes e_2\cr \vdots \cr I_D\otimes e_T}\otimes I_D \cr \cr {\rm vec}(I_T\otimes d\Sigma) &= M\,{\rm vec}(d\Sigma) \cr }$$
    – greg, Jan 21 at 5:36












  • Many thanks greg! I've already created the bounty of 50 points. I can only award it in 23 hours. If, by then, I haven't, send me a message. ;)
    – An old man in the sea., Jan 24 at 9:03






  • Yes, although I would write it using a Hadamard product as $e^s\odot{\rm diag}(B)$, which can be more computationally efficient for large dimensions.
    – greg, Jan 29 at 21:40










