Derivation of $\frac{\partial }{\partial \Sigma} \left(-\frac{1}{2}\log(\det(M\otimes K +I_T \otimes \Sigma)) \right)$


























I've done a derivation below, and I'm wondering if it's correct. If you help me with both derivations, I'll throw in some bonus/bounty points.



For ease of notation, let's define $U_{M\Sigma}=M\otimes K +I_T \otimes \Sigma$.



Also, $U_{M\Sigma}$ is symmetric and positive definite.



We know that



\begin{align*}
d\left(-\frac{1}{2}\log(\det(M\otimes K +I_T \otimes \Sigma))\right)&= \operatorname{Tr}\left(-\frac{1}{2}U_{M\Sigma}^{-1}\,dU_{M\Sigma}\right)\\
&=\operatorname{Tr}\left(-\frac{1}{2}\operatorname{vec}(U_{M\Sigma}^{-1})^\intercal \operatorname{vec}(I_T\otimes d\Sigma)\right)
\end{align*}



and we notice that



\begin{align*}
\operatorname{vec}(I_T\otimes d\Sigma)&=\left[
\begin{array}{c}
\operatorname{vec}(e_1\otimes d\Sigma) \\
\vdots\\
\operatorname{vec}(e_T\otimes d\Sigma)
\end{array}
\right]=\left[
\begin{array}{c}
((e_1\otimes I_D)\otimes I_D)\operatorname{vec}(d\Sigma) \\
\vdots\\
((e_T\otimes I_D)\otimes I_D)\operatorname{vec}(d\Sigma)
\end{array}
\right]\\
&=(\operatorname{vec}(I_T)\otimes I_{D^2})\operatorname{vec}(d\Sigma)
\end{align*}



Therefore we have
\begin{align*}
d\left(-\frac{1}{2}\log(\det(M\otimes K +I_T \otimes \Sigma))\right)&= \operatorname{Tr}\left(-\frac{1}{2}U_{M\Sigma}^{-1}\,dU_{M\Sigma}\right)\\
&=\operatorname{Tr}\left(-\frac{1}{2}\operatorname{vec}(U_{M\Sigma}^{-1})^\intercal (\operatorname{vec}(I_T)\otimes I_{D^2}) \operatorname{vec}(d\Sigma)\right)\\
&=\operatorname{Tr}\left(-\frac{1}{2}\Gamma_{\Sigma}^\intercal\, d\Sigma\right)
\end{align*}



where $\Gamma_{\Sigma}$ is defined by $\operatorname{vec}(\Gamma_{\Sigma})^\intercal=\operatorname{vec}(U_{M\Sigma}^{-1})^\intercal (\operatorname{vec}(I_T)\otimes I_{D^2})$, and so $\frac{\partial}{\partial \Sigma}\left(-\frac{1}{2}\log(\det(M\otimes K +I_T \otimes \Sigma))\right)=-\frac{1}{2}\Gamma_{\Sigma}$.
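Independently of the vec-expansion step, the differential identity in the first line can be sanity-checked numerically. Below is a small Python/NumPy sketch (not part of the original derivation; helper names such as `make_spd` are illustrative) comparing $\operatorname{Tr}(-\tfrac12 U_{M\Sigma}^{-1}\,dU_{M\Sigma})$ with a central finite difference of $-\tfrac12\log\det$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 3, 2

def make_spd(n):
    # random symmetric positive-definite matrix
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

M, K, Sigma = make_spd(T), make_spd(D), make_spd(D)

def f(S):
    # -1/2 log det(M (x) K + I_T (x) S), via slogdet for stability
    return -0.5 * np.linalg.slogdet(np.kron(M, K) + np.kron(np.eye(T), S))[1]

# directional derivative along a symmetric perturbation dSigma
dSigma = rng.standard_normal((D, D))
dSigma = dSigma + dSigma.T

U = np.kron(M, K) + np.kron(np.eye(T), Sigma)
dU = np.kron(np.eye(T), dSigma)
analytic = -0.5 * np.trace(np.linalg.solve(U, dU))

eps = 1e-6
numeric = (f(Sigma + eps * dSigma) - f(Sigma - eps * dSigma)) / (2 * eps)
print(abs(analytic - numeric))  # small, roughly 1e-8 or below
```

Note this checks only the trace identity, not the later vec-Kronecker expansion, which is what the comments below question.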




Extra Points for help with the derivation below:



\begin{align*}
d(-v^\intercal (M\otimes K+I_T\otimes \Sigma)^{-1}v)&=-v^\intercal \left(-U_{M\Sigma}^{-1}(dU_{M\Sigma})U_{M\Sigma}^{-1}\right)v\\
&=v^\intercal U_{M\Sigma}^{-1}(I_T\otimes d\Sigma)U_{M\Sigma}^{-1}v\\
&=v^\intercal U_{M\Sigma}^{-1}\operatorname{vec}(d\Sigma\, W_{M\Sigma}\, I_T)
\end{align*}



where $W_{M\Sigma}$ is the $D\times T$ matrix defined (conformably) by $\operatorname{vec}(W_{M\Sigma}) = U_{M\Sigma}^{-1}v$. Therefore, we have
\begin{align*}
d(-v^\intercal (M\otimes K+I_T\otimes \Sigma)^{-1}v)&=\operatorname{Tr}(W_{M\Sigma}^\intercal\, d\Sigma\, W_{M\Sigma}\, I_T)\\
&=\operatorname{Tr}(W_{M\Sigma} W_{M\Sigma}^\intercal\, d\Sigma)
\end{align*}



And we have $\frac{\partial}{\partial \Sigma}(-v^\intercal (M\otimes K+I_T\otimes \Sigma)^{-1}v)=W_{M\Sigma}W_{M\Sigma}^\intercal$.
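The trace form $\operatorname{Tr}(W_{M\Sigma} W_{M\Sigma}^\intercal\, d\Sigma)$ above can also be checked against finite differences. A Python/NumPy sketch (illustrative, not from the original post; note the column-major reshape so that $\operatorname{vec}(W)=U^{-1}v$):

```python
import numpy as np

rng = np.random.default_rng(1)
T, D = 3, 2

def make_spd(n):
    # random symmetric positive-definite matrix
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

M, K, Sigma = make_spd(T), make_spd(D), make_spd(D)
v = rng.standard_normal(T * D)

def U_of(S):
    return np.kron(M, K) + np.kron(np.eye(T), S)

# W is D x T with vec(W) = U^{-1} v (column-major, i.e. Fortran order)
W = np.linalg.solve(U_of(Sigma), v).reshape(D, T, order="F")
grad = W @ W.T  # candidate gradient of -v' U^{-1} v w.r.t. Sigma

# entrywise finite-difference gradient for comparison
def g(S):
    return -v @ np.linalg.solve(U_of(S), v)

eps = 1e-6
fd = np.zeros((D, D))
for i in range(D):
    for j in range(D):
        E = np.zeros((D, D))
        E[i, j] = 1.0
        fd[i, j] = (g(Sigma + eps * E) - g(Sigma - eps * E)) / (2 * eps)

print(np.max(np.abs(grad - fd)))  # small discrepancy expected
```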



Now if we define $L= -\frac{1}{2}\log(\det(U_{M\Sigma})) -v^\intercal U_{M\Sigma}^{-1}v$, then
$$\frac{\partial L}{\partial \Sigma}=-\frac{1}{2}\Gamma_{\Sigma}+W_{M\Sigma}W_{M\Sigma}^\intercal.$$



Also, if $\Sigma=\operatorname{Diag}(e^{\tilde{\sigma}^2_i})$, then



$$\frac{\partial L}{\partial \tilde{\sigma}_i^2} = \operatorname{diag}\left(\left(\frac{\partial L}{\partial \Sigma}\right)^{\!\intercal} \operatorname{Diag}(e^{\tilde{\sigma}_i^2})\right).$$
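Since $\Sigma$ is diagonal here, this chain rule reduces to $\partial L/\partial\tilde{\sigma}_i^2 = (\partial L/\partial\Sigma)_{ii}\,e^{\tilde{\sigma}_i^2}$, an elementwise (Hadamard) product with the diagonal. A tiny NumPy sketch (illustrative; `G` stands in for an arbitrary $\partial L/\partial\Sigma$):

```python
import numpy as np

rng = np.random.default_rng(2)
D = 3
sigma_tilde = rng.standard_normal(D)  # the parameters tilde{sigma}^2_i
G = rng.standard_normal((D, D))       # stands in for dL/dSigma

# the diag(G' Diag(e^sigma_tilde)) form from the text ...
chain = np.diag(G.T @ np.diag(np.exp(sigma_tilde)))
# ... equals the elementwise (Hadamard) form
direct = np.diag(G) * np.exp(sigma_tilde)

print(np.allclose(chain, direct))  # True
```

The Hadamard form avoids forming the dense matrix product, which matters for large $D$.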

































  • There is a problem with the 2nd line of the gradient. Note that $$\eqalign{ {\rm tr}(A\,dX) &= A^T:dX \cr &= {\rm vec}(A^T):{\rm vec}(dX) \cr &= {\rm vec}(A^T)^T\,{\rm vec}(dX) \cr &= (K\,{\rm vec}(A))^T\,{\rm vec}(dX) \cr &= {\rm vec}(A)^TK^T\,{\rm vec}(dX) \cr }$$ where $K$ is the commutation matrix associated with the Kronecker product.
    – greg, Jan 20 at 3:46












  • @greg Many thanks for the help. I'm not sure I understood your comment. I forgot to state that $U_{M\Sigma}$ is symmetric and positive definite. Sorry.
    – An old man in the sea., Jan 20 at 13:55


















proof-verification matrix-calculus








edited Feb 1 at 13:43 by Martin Sleziak

asked Jan 19 at 15:13 by An old man in the sea.
























1 Answer



















Instead of vectorization, take advantage of the block structure of your matrix by introducing a block version of the diag() operator
$$B_k={\rm bldiag}(M,k,n)$$
which extracts the $k^{\rm th}$ block along the diagonal of $M,$ where $1\le k\le n.$

The dimension of the block is $\tfrac{1}{n}$ of the corresponding dimension of the parent matrix.



Also note that for any value of $k,\;{\rm bldiag}\big((I_T\otimes \Sigma),k,T\big) = \Sigma,$

and further, these are the only non-zero blocks in the entire matrix.



Back to your specific problem. To reduce the clutter, let's drop most subscripts, ignore the scalar factors, rename the variable $\Sigma\rightarrow S$ so it's not confused with summation, and rename $T\rightarrow n$ so as not to confuse it with the transpose operation.
$$\eqalign{
U &= M\otimes K + I_n\otimes S \cr
\phi &= \log\det U \cr
d\phi &= d\,{\rm tr}(\log U) \cr
&= U^{-T}:dU \cr
&= U^{-T}:(I_n\otimes dS) \cr
&= \sum_{k=1}^n{\rm bldiag}\big(U^{-T},k,n\big):{\rm bldiag}\big((I_n\otimes dS),k,n\big) \cr
&= \sum_{k=1}^nB_k:dS \cr
&= B:dS \cr
\frac{\partial\phi}{\partial S} &= B \cr\cr
}$$

The second problem is quite similar.
$$\eqalign{
W &= vv^T \cr
\psi &= -W:U^{-1} \cr
d\psi &= W:U^{-1}\,dU\,U^{-1} \cr
&= U^{-T}WU^{-T}:dU \cr
&= \sum_{k=1}^n{\rm bldiag}\big(U^{-T}WU^{-T},k,n\big):dS \cr
&= C:dS \cr
\frac{\partial\psi}{\partial S} &= C \cr\cr
}$$

For coding purposes, assume you have
$$\eqalign{
A&\in{\mathbb R}^{pm\times pn} \cr
}$$

and you wish to calculate the sum of the block diagonals, i.e.
$$\eqalign{
B &= \sum_{k=1}^p{\rm bldiag}(A,k,p) \quad\in {\mathbb R}^{m\times n} \cr
}$$

In almost all programming languages you can access a sub-matrix using index ranges, so you don't need to waste RAM creating vectors and matrices to hold intermediate results.



For example, in Julia (or Matlab) you can write



B = zeros(m,n)
for k = 1:p
    B += A[k*m-m+1:k*m, k*n-n+1:k*n]
end


So this single for-loop will calculate the gradients shown above.
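For readers working in Python rather than Julia/Matlab, here is an equivalent NumPy sketch (illustrative, not from the original answer; assumes `A` has shape `(p*m, p*n)`), including a reshape-based variant that avoids explicit slicing arithmetic:

```python
import numpy as np

def bldiag_sum(A, p):
    """Sum of the p diagonal blocks of a (p*m) x (p*n) matrix A."""
    pm, pn = A.shape
    m, n = pm // p, pn // p
    # view A as a p x p grid of m x n blocks, then sum the diagonal ones
    blocks = A.reshape(p, m, p, n)
    return sum(blocks[k, :, k, :] for k in range(p))

# compare with the slice-based loop from the answer
rng = np.random.default_rng(3)
p, m, n = 3, 2, 2
A = rng.standard_normal((p * m, p * n))

B = np.zeros((m, n))
for k in range(p):
    B += A[k * m:(k + 1) * m, k * n:(k + 1) * n]

print(np.allclose(bldiag_sum(A, p), B))  # True
```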






Update


Here is a little Julia (v $0.6$) script to check the various "vec-kronecker" expansions.

s,t = 2,3; dS = 4*rand(s,s); dS += dS'; It = eye(t); Is = eye(s);
Iss = kron(Is,Is); M = kron(Is,It[:,1]);
for k = 2:t; M = [ M; kron(Is,It[:,k]) ]; end
M = kron(M,Is);

x = kron(vec(It), Iss)*vec(dS);
y = vec(kron(It,dS));
z = M*vec(dS);

println( "x = $(x[1:9])\ny = $(y[1:9])\nz = $(z[1:9])" )
x = [3.07674, 5.3285, 5.3285, 3.51476, 0.0, 0.0, 0.0, 0.0, 0.0]
y = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]
z = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]


In symbols
$$\eqalign{
M &= \pmatrix{I_s\otimes e_1\cr I_s\otimes e_2\cr\ldots\cr I_s\otimes e_t}\otimes I_s \cr
x &= \big({\rm vec}(I_t)\otimes I_{s^2}\big)\,{\rm vec}(dS) \cr
y &= {\rm vec}(I_t\otimes dS) \cr
z &= M\,{\rm vec}(dS) \cr
x &\ne y = z \cr
}$$







  • Greg, thanks for your answer. I'll need to study it carefully. However, I'm really interested in knowing whether I got a correct derivation above. The reason is that it's part of a programme I coded, and rederiving the gradient would mean too much work in the code... I'll give you a bounty for the work. ;)
    – An old man in the sea., Jan 20 at 21:46












  • Greg, thanks for the extra info. However, I'm really, really interested in knowing if what I got is right. I hope you can help me with that.
    – An old man in the sea., Jan 21 at 1:48










  • An easy way to check is to pick random $(M,\,K,\,\Sigma,\,d\Sigma)$ matrices and calculate the gradient using your formula and my formula. But I think there's a problem in your derivation with the expansion of this term $$\eqalign{ M &= \pmatrix{I_D\otimes e_1\cr I_D\otimes e_2\cr\ldots\cr I_D\otimes e_T}\otimes I_D \cr \cr {\rm vec}(I_T\otimes d\Sigma) &= M\,{\rm vec}(d\Sigma) \cr }$$
    – greg, Jan 21 at 5:36












  • Many thanks greg! I've already created the bounty of 50 points. I can only award it in 23 hours. If, by then, I haven't, send me a message. ;)
    – An old man in the sea., Jan 24 at 9:03






  • Yes, although I would write it using a Hadamard product as $e^s\odot{\rm diag}(B)\,,$ which can be more computationally efficient for large dimensions.
    – greg, Jan 29 at 21:40











1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2





+50







$begingroup$

Instead of vectorization, take advantage of the block structure of your matrix by introducing a block version of the diag() operator
$$B_k={rm bldiag}(M,k,n)$$
which extracts the $k^{th}$ block along the diagonal of $M,$ where $1le kle n.$

The dimension of the block is $tfrac{1}{n}$ of the corresponding dimension of the parent matrix.



Also note that for any value of $k,,,{rm bldiag}big((I_Totimes Sigma),k,Tbig) = Sigma,$

and further, these are the only non-zero blocks in the entire matrix.



Back to your specific problem. To reduce the clutter let's drop most subscripts, ignore the scalar factors, rename the variable $Sigmarightarrow S$ so it's not confused with summation, and rename $Trightarrow n$ so as not to confuse it with the transpose operation.
$$eqalign{
U &= Motimes K + I_notimes S cr
phi &= logdet U cr
dphi &= d{,rm tr}(log U) cr
&= U^{-T}:dU cr
&= U^{-T}:(I_notimes dS) cr
&= sum_{k=1}^n{rm bldiag}big(U^{-T},k,nbig):{rm bldiag}big((I_notimes dS),k,nbig)
cr
&= sum_{k=1}^nB_k:dS cr
&= B:dS cr
frac{partialphi}{partial S} &= B crcr
}$$

The second problem is quite similar.
$$eqalign{
W &= vv^T cr
psi &= -W:U^{-1} cr
dpsi
&= W:U^{-1},dU,U^{-1} cr
&= U^{-T}WU^{-T}:dU cr
&= sum_{k=1}^n{rm bldiag}big(U^{-T}WU^{-T},k,nbig):dS cr
&= C:dS cr
frac{partialpsi}{partial S} &= C crcr
}$$

For coding purposes, assume you have
$$eqalign{
A&in{mathbb R}^{pmtimes pn} cr
}$$

and you wish to calculate the sum of the block diagonals, i.e.
$$eqalign{
B &= sum_{k=1}^p{rm bldiag}(A,k,p) quadin {mathbb R}^{mtimes n} cr
}$$

In almost all programming languages you can access a sub-matrix using index ranges, so you don't need to waste RAM creating vectors and matrices to hold intermediate results.



For example, in Julia (or Matlab) you can write



B = zeros(m,n)
for k = 1:p
B += A[k*m-m+1:k*m, k*n-n+1:k*n]
end


So this single for-loop will calculate the gradients shown above.






Update


Here is a little Julia (v $0.6$) script to check the various "vec-kronecker" expansions.

s,t = 2,3; dS = 4*rand(s,s); dS += dS'; It = eye(t); Is = eye(s);
Iss = kron(Is,Is); M = kron(Is,It[:,1]);
for k = 2:t; M = [ M; kron(Is,It[:,k]) ]; end
M = kron(M,Is);

x = kron(vec(It), Iss)*vec(dS);
y = vec(kron(It,dS));
z = M*vec(dS);

println( "x = $(x[1:9])ny = $(y[1:9])nz = $(z[1:9])" )
x = [3.07674, 5.3285, 5.3285, 3.51476, 0.0, 0.0, 0.0, 0.0, 0.0]
y = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]
z = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]


In symbols
$$eqalign{
M &= pmatrix{I_sotimes e_1cr I_sotimes e_2crldotscr I_sotimes e_t}
otimes I_s cr
x &= big({rm vec}(I_t)otimes I_{s^2}big),{rm vec}(dS) cr
y &= {rm vec}(I_totimes dS) cr
z &= M,{rm vec}(dS) cr
x &ne y = z cr
}$$






share|cite|improve this answer











$endgroup$













  • $begingroup$
    Greg, thanks for your answer. I'll need to study it carefully. However, I'm really interested in knowing whether I got a correct derivation above. The reason why is that it's part of a programme I coded, and rederiving the gradient would mean too much work in the code...I'll give you a bounty for the work. ;)
    $endgroup$
    – An old man in the sea.
    Jan 20 at 21:46












  • $begingroup$
    Greg, the thanks for the extra info. However, I'm really, really interested in knowing if what I got is right. I hope you can help me with that.
    $endgroup$
    – An old man in the sea.
    Jan 21 at 1:48










  • $begingroup$
    An easy way to check is to pick random $(M,,K,,Sigma,,dSigma)$ matrices and calculate the gradient using your formula and my formula. But I think there's a problem in your derivation with the expansion of this term $$eqalign{ M &= pmatrix{I_Dotimes e_1cr I_Dotimes e_2crldotscr I_Dotimes e_T}otimes I_D cr cr {rm vec}(I_Totimes dSigma) &= M,{rm vec}(dSigma) cr }$$
    $endgroup$
    – greg
    Jan 21 at 5:36












  • $begingroup$
    Many thanks greg! I've already created the bounty of 50 points. I can only award it in 23 hours. If, by then, I haven't, send me a message. ;)
    $endgroup$
    – An old man in the sea.
    Jan 24 at 9:03






  • 1




    $begingroup$
    Yes, although I would write it using a Hadamard product as $e^sodot{rm diag}(B),,$ which can be more computationally efficient for large dimensions.
    $endgroup$
    – greg
    Jan 29 at 21:40
















2





+50







$begingroup$

Instead of vectorization, take advantage of the block structure of your matrix by introducing a block version of the diag() operator
$$B_k={rm bldiag}(M,k,n)$$
which extracts the $k^{th}$ block along the diagonal of $M,$ where $1le kle n.$

The dimension of the block is $tfrac{1}{n}$ of the corresponding dimension of the parent matrix.



Also note that for any value of $k,,,{rm bldiag}big((I_Totimes Sigma),k,Tbig) = Sigma,$

and further, these are the only non-zero blocks in the entire matrix.



Back to your specific problem. To reduce the clutter let's drop most subscripts, ignore the scalar factors, rename the variable $Sigmarightarrow S$ so it's not confused with summation, and rename $Trightarrow n$ so as not to confuse it with the transpose operation.
$$eqalign{
U &= Motimes K + I_notimes S cr
phi &= logdet U cr
dphi &= d{,rm tr}(log U) cr
&= U^{-T}:dU cr
&= U^{-T}:(I_notimes dS) cr
&= sum_{k=1}^n{rm bldiag}big(U^{-T},k,nbig):{rm bldiag}big((I_notimes dS),k,nbig)
cr
&= sum_{k=1}^nB_k:dS cr
&= B:dS cr
frac{partialphi}{partial S} &= B crcr
}$$

The second problem is quite similar.
$$eqalign{
W &= vv^T cr
psi &= -W:U^{-1} cr
dpsi
&= W:U^{-1},dU,U^{-1} cr
&= U^{-T}WU^{-T}:dU cr
&= sum_{k=1}^n{rm bldiag}big(U^{-T}WU^{-T},k,nbig):dS cr
&= C:dS cr
frac{partialpsi}{partial S} &= C crcr
}$$

For coding purposes, assume you have
$$eqalign{
A&in{mathbb R}^{pmtimes pn} cr
}$$

and you wish to calculate the sum of the block diagonals, i.e.
$$eqalign{
B &= sum_{k=1}^p{rm bldiag}(A,k,p) quadin {mathbb R}^{mtimes n} cr
}$$

In almost all programming languages you can access a sub-matrix using index ranges, so you don't need to waste RAM creating vectors and matrices to hold intermediate results.



For example, in Julia (or Matlab) you can write



B = zeros(m,n)
for k = 1:p
B += A[k*m-m+1:k*m, k*n-n+1:k*n]
end


So this single for-loop will calculate the gradients shown above.






Update


Here is a little Julia (v $0.6$) script to check the various "vec-kronecker" expansions.

s,t = 2,3; dS = 4*rand(s,s); dS += dS'; It = eye(t); Is = eye(s);
Iss = kron(Is,Is); M = kron(Is,It[:,1]);
for k = 2:t; M = [ M; kron(Is,It[:,k]) ]; end
M = kron(M,Is);

x = kron(vec(It), Iss)*vec(dS);
y = vec(kron(It,dS));
z = M*vec(dS);

println( "x = $(x[1:9])ny = $(y[1:9])nz = $(z[1:9])" )
x = [3.07674, 5.3285, 5.3285, 3.51476, 0.0, 0.0, 0.0, 0.0, 0.0]
y = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]
z = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]


In symbols
$$eqalign{
M &= pmatrix{I_sotimes e_1cr I_sotimes e_2crldotscr I_sotimes e_t}
otimes I_s cr
x &= big({rm vec}(I_t)otimes I_{s^2}big),{rm vec}(dS) cr
y &= {rm vec}(I_totimes dS) cr
z &= M,{rm vec}(dS) cr
x &ne y = z cr
}$$






share|cite|improve this answer











$endgroup$













  • $begingroup$
    Greg, thanks for your answer. I'll need to study it carefully. However, I'm really interested in knowing whether I got a correct derivation above. The reason why is that it's part of a programme I coded, and rederiving the gradient would mean too much work in the code...I'll give you a bounty for the work. ;)
    $endgroup$
    – An old man in the sea.
    Jan 20 at 21:46












  • $begingroup$
    Greg, the thanks for the extra info. However, I'm really, really interested in knowing if what I got is right. I hope you can help me with that.
    $endgroup$
    – An old man in the sea.
    Jan 21 at 1:48










  • $begingroup$
    An easy way to check is to pick random $(M,,K,,Sigma,,dSigma)$ matrices and calculate the gradient using your formula and my formula. But I think there's a problem in your derivation with the expansion of this term $$eqalign{ M &= pmatrix{I_Dotimes e_1cr I_Dotimes e_2crldotscr I_Dotimes e_T}otimes I_D cr cr {rm vec}(I_Totimes dSigma) &= M,{rm vec}(dSigma) cr }$$
    $endgroup$
    – greg
    Jan 21 at 5:36












  • $begingroup$
    Many thanks greg! I've already created the bounty of 50 points. I can only award it in 23 hours. If, by then, I haven't, send me a message. ;)
    $endgroup$
    – An old man in the sea.
    Jan 24 at 9:03






  • 1




    $begingroup$
    Yes, although I would write it using a Hadamard product as $e^sodot{rm diag}(B),,$ which can be more computationally efficient for large dimensions.
    $endgroup$
    – greg
    Jan 29 at 21:40














2





+50







2





+50



2




+50



$begingroup$

Instead of vectorization, take advantage of the block structure of your matrix by introducing a block version of the diag() operator
$$B_k={rm bldiag}(M,k,n)$$
which extracts the $k^{th}$ block along the diagonal of $M,$ where $1le kle n.$

The dimension of the block is $tfrac{1}{n}$ of the corresponding dimension of the parent matrix.



Also note that for any value of $k,,,{rm bldiag}big((I_Totimes Sigma),k,Tbig) = Sigma,$

and further, these are the only non-zero blocks in the entire matrix.



Back to your specific problem. To reduce the clutter let's drop most subscripts, ignore the scalar factors, rename the variable $Sigmarightarrow S$ so it's not confused with summation, and rename $Trightarrow n$ so as not to confuse it with the transpose operation.
$$eqalign{
U &= Motimes K + I_notimes S cr
phi &= logdet U cr
dphi &= d{,rm tr}(log U) cr
&= U^{-T}:dU cr
&= U^{-T}:(I_notimes dS) cr
&= sum_{k=1}^n{rm bldiag}big(U^{-T},k,nbig):{rm bldiag}big((I_notimes dS),k,nbig)
cr
&= sum_{k=1}^nB_k:dS cr
&= B:dS cr
frac{partialphi}{partial S} &= B crcr
}$$

The second problem is quite similar.
$$eqalign{
W &= vv^T cr
psi &= -W:U^{-1} cr
dpsi
&= W:U^{-1},dU,U^{-1} cr
&= U^{-T}WU^{-T}:dU cr
&= sum_{k=1}^n{rm bldiag}big(U^{-T}WU^{-T},k,nbig):dS cr
&= C:dS cr
frac{partialpsi}{partial S} &= C crcr
}$$

For coding purposes, assume you have
$$eqalign{
A&in{mathbb R}^{pmtimes pn} cr
}$$

and you wish to calculate the sum of the block diagonals, i.e.
$$eqalign{
B &= sum_{k=1}^p{rm bldiag}(A,k,p) quadin {mathbb R}^{mtimes n} cr
}$$

In almost all programming languages you can access a sub-matrix using index ranges, so you don't need to waste RAM creating vectors and matrices to hold intermediate results.



For example, in Julia (or Matlab) you can write



B = zeros(m,n)
for k = 1:p
B += A[k*m-m+1:k*m, k*n-n+1:k*n]
end


So this single for-loop will calculate the gradients shown above.






Update


Here is a little Julia (v $0.6$) script to check the various "vec-kronecker" expansions.

s,t = 2,3; dS = 4*rand(s,s); dS += dS'; It = eye(t); Is = eye(s);
Iss = kron(Is,Is); M = kron(Is,It[:,1]);
for k = 2:t; M = [ M; kron(Is,It[:,k]) ]; end
M = kron(M,Is);

x = kron(vec(It), Iss)*vec(dS);
y = vec(kron(It,dS));
z = M*vec(dS);

println( "x = $(x[1:9])\ny = $(y[1:9])\nz = $(z[1:9])" )
x = [3.07674, 5.3285, 5.3285, 3.51476, 0.0, 0.0, 0.0, 0.0, 0.0]
y = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]
z = [3.07674, 5.3285, 0.0, 0.0, 0.0, 0.0, 5.3285, 3.51476, 0.0]


In symbols
$$\eqalign{
M &= \pmatrix{I_s\otimes e_1\cr I_s\otimes e_2\cr \vdots\cr I_s\otimes e_t}\otimes I_s \cr
x &= \big({\rm vec}(I_t)\otimes I_{s^2}\big)\,{\rm vec}(dS) \cr
y &= {\rm vec}(I_t\otimes dS) \cr
z &= M\,{\rm vec}(dS) \cr
x &\ne y = z \cr
}$$
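The same discrepancy can be reproduced without Julia. Below is a pure-Python sketch (the helpers `kron`, `vec`, `eye`, and `matvec` are my own minimal stand-ins; `vec` is column-major, as in the derivation) using $s = t = 2$:

```python
def kron(A, B):
    """Kronecker product of two list-of-lists matrices."""
    p, q = len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q]
             for j in range(len(A[0]) * q)]
            for i in range(len(A) * p)]

def vec(M):
    """Column-major vectorization."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matvec(M, v):
    return [sum(Mi[j] * v[j] for j in range(len(v))) for Mi in M]

s, t = 2, 2
dS = [[1.0, 2.0], [2.0, 3.0]]        # a symmetric dS
It, Iss = eye(t), eye(s * s)

vIt = [[e] for e in vec(It)]         # vec(I_t) as a column matrix
x = matvec(kron(vIt, Iss), vec(dS))  # (vec(I_t) ⊗ I_{s^2}) vec(dS)
y = vec(kron(It, dS))                # vec(I_t ⊗ dS)
print(x != y)                        # True: the two expansions differ
```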






answered Jan 20 at 17:41, edited Jan 22 at 13:32 – greg
  • Greg, thanks for your answer. I'll need to study it carefully. However, I'm really interested in knowing whether I got a correct derivation above. The reason why is that it's part of a programme I coded, and rederiving the gradient would mean too much work in the code... I'll give you a bounty for the work. ;)
    – An old man in the sea. Jan 20 at 21:46

  • Greg, thanks for the extra info. However, I'm really, really interested in knowing if what I got is right. I hope you can help me with that.
    – An old man in the sea. Jan 21 at 1:48

  • An easy way to check is to pick random $(M,\,K,\,\Sigma,\,d\Sigma)$ matrices and calculate the gradient using your formula and my formula. But I think there's a problem in your derivation with the expansion of this term $$\eqalign{ M &= \pmatrix{I_D\otimes e_1\cr I_D\otimes e_2\cr \vdots\cr I_D\otimes e_T}\otimes I_D \cr \cr {\rm vec}(I_T\otimes d\Sigma) &= M\,{\rm vec}(d\Sigma) \cr }$$
    – greg Jan 21 at 5:36

  • Many thanks greg! I've already created the bounty of 50 points. I can only award it in 23 hours. If, by then, I haven't, send me a message. ;)
    – An old man in the sea. Jan 24 at 9:03

  • Yes, although I would write it using a Hadamard product as $e^s\odot{\rm diag}(B)$, which can be more computationally efficient for large dimensions.
    – greg Jan 29 at 21:40