西西河

Topic: [Original] Some wild musings around brain science -- 鸿乾

Hopfield neural network & BM

I will add a bit more commentary on the basic ideas behind the Hopfield neural network and the Boltzmann Machine (BM), the popular AI topics, now that I have skimmed the quoted paper.

Again, these are very rough analogies, meant to get across some basic but very important physics concepts now widely applied in AI modeling, the way I see it.

1.

Heat bath

A system normally has to evolve into equilibrium with its heat bath (its environment, roughly speaking, although the bath could be invisible, could be internal as well, or "challenges/futures" in general, etc.).

"The zeroth law is more fundamental than any of the other laws, but the need to express this phenomenon as a law was not recognized until the 1930s. The zeroth law was formally proposed by the British physicist R. H. Fowler in 1930, more than 80 years after the first and second laws of thermodynamics; yet because it underlies those later laws, it is called the zeroth law of thermodynamics."

The zeroth law is often held to make it possible to establish a temperature function; put more loosely, to build a thermometer. This question is one of the topics in the philosophy of thermodynamics and statistical mechanics.

In the function space of thermodynamic variables, the regions of constant temperature form surfaces, which provide a natural ordering of nearby surfaces.

In China, basically TG (the Center, 中央) has been the heat bath for 5,000 years already;

the global environment, including science and technology and economic-political dynamics, is the heat bath in general for all of us to struggle with.

2.

White physicists assume that at the macroscopic level, in general and in the "short term", the heat bath as we know it is fairly stable, physics-wise, so the heat bath itself is normalized (Markov, etc.). There is a canonical statistical-mechanics model for that, and if we figure it out, we would know the temperature function (溫度函數), we would have a thermometer (溫度計); and as a subsystem, we just need to normalize (dynamic relaxation, 弛豫, exchanging energy etc. with the heat bath) into this heat bath, or we will not survive.
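As a toy illustration of that canonical picture, here is a minimal sketch (my own construction, not from the quoted paper) of a small spin chain thermalized against a heat bath at inverse temperature beta: the canonical partition function Z is built by exact enumeration, and the made-up nearest-neighbour energy function is just for illustration.

```python
import itertools
import math

def energy(spins):
    # Illustrative nearest-neighbour chain energy: E = -sum_i s_i * s_{i+1}
    return -sum(s1 * s2 for s1, s2 in zip(spins, spins[1:]))

def canonical_average(n_spins, beta, observable):
    # Exact canonical average: <O> = sum_s O(s) exp(-beta E(s)) / Z
    z = 0.0
    acc = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        w = math.exp(-beta * energy(spins))
        z += w
        acc += w * observable(spins)
    return acc / z

# A colder bath (larger beta) pulls the subsystem toward lower mean energy:
# the subsystem "normalizes" into the bath's temperature.
e_hot = canonical_average(6, 0.1, energy)
e_cold = canonical_average(6, 2.0, energy)
assert e_cold < e_hot
```

The point of the sketch is only that the bath enters the subsystem's statistics through a single parameter, beta, which is exactly what makes a temperature function (and hence a thermometer) possible.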

3.

Spectral distribution (譜分佈)

Obviously, the heat bath (and all the struggling subsystems inside or outside it) is dynamic, jumping and dancing around/near the equilibrium state, with some kind of spectrum.

(Yes, there is Professor Prigogine's "dissipative structures" theory from nonlinear chemistry: a non-equilibrium system that exchanges matter and energy with the outside world can survive and prosper, etc. But for lack of a mathematical model and experimental proof, it is not yet a mainstream theory in the physics world.)

What kind of spectrum? From the book you recommended:

"From Pythagoras's harmonic sequence to Einstein's theory of relativity, geometric models of position, proximity, ratio, and the underlying properties of physical space have provided us with powerful ideas and accurate scientific tools"

4.

Physics into AI, and AI into social ideology (of all kinds) in general.

First of all, why physics-based AI?

Aside from the energy partition function we talked about, physics tools such as the principle of least action (最小作用量原理) and the Feynman path integral can help us much more in gauging the future path of a system, where pure math or social statistics are challenged.

"Currently, similar geometric models are being applied to another type of space—the conceptual space of information and meaning, where the contributions of Pythagoras and Einstein are a part of the landscape itself."

As previously discussed, short of a quantum computer, AI at the machine/OS level is very difficult, but you could build an AI operating system/kernel/apps atop the existing machine/OS: you could still have an AI network.

"Rigorous results on the thermodynamics of the dilute ... - Springer

Journal of Statistical Physics, Vol. 72, Nos. 1/2, 1993. Rigorous Results on the Thermodynamics of the Dilute Hopfield Model. Anton Bovier and Véronique ..."

boy, white evils have done this kind of research for over 20 years?

So, a modern-physics-powered AI layer atop the current global mobile internet is coming; with that, business, social culture, and political ideology are all going to be disrupted.

Dear Chairman X, stay with the GFW, but please put Vice Chairman 李源潮 in charge of TG's science and technology; he is a math guy, possibly the only "science literate" person in the 中央政治局.

what a world.

----heatbath concept-----

Zeroth law of thermodynamics

Wikipedia, the free encyclopedia


The zeroth law of thermodynamics describes bodies in contact at thermal equilibrium, and provides the theoretical basis for temperature. The most common statement of the law is:

If two thermodynamic systems are each in thermal equilibrium with a third system, then they are in thermal equilibrium with each other.

In other words, the zeroth law says that thermal equilibrium, viewed as a mathematical binary relation, is transitive.
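This transitivity is what lets pairwise equilibrium observations partition systems into classes of equal temperature. A small union-find sketch of that bookkeeping (the system names and observations are hypothetical, purely for illustration):

```python
def find(parent, x):
    # Follow parent pointers to the class representative, compressing the path.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def equilibrium_classes(systems, observed_pairs):
    # Each observed pair "a in equilibrium with b" merges two classes;
    # transitivity (the zeroth law) justifies treating classes as equal-temperature sets.
    parent = {s: s for s in systems}
    for a, b in observed_pairs:
        parent[find(parent, a)] = find(parent, b)
    classes = {}
    for s in systems:
        classes.setdefault(find(parent, s), set()).add(s)
    return list(classes.values())

# A~B and B~C observed; the zeroth law lets us conclude A~C without testing it.
groups = equilibrium_classes(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
assert {"A", "B", "C"} in groups and {"D"} in groups
```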


History

The zeroth law is more fundamental than any of the other laws, but the need to express this phenomenon as a law was not recognized until the 1930s. The zeroth law was formally proposed by the British physicist R. H. Fowler in 1930, more than 80 years after the first and second laws of thermodynamics; yet because it underlies those later laws, it is called the zeroth law of thermodynamics.

Overview

The macroscopic physical properties (pressure, temperature, volume, etc.) of a system in thermal equilibrium do not change with time. A cup of hot coffee on the dining table is not in equilibrium with its surroundings, because the coffee is cooling. When the coffee stops cooling, its temperature equals room temperature and it is in equilibrium with its surroundings.
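The coffee-cup relaxation into the room (the heat bath) can be sketched with a Newton-cooling toy model, dT/dt = -k (T - T_room), via forward Euler; the rate constant and temperatures are made-up illustrative values, not measurements.

```python
def relax(t_start, t_room, k=0.1, dt=0.1, steps=2000):
    # Forward-Euler integration of Newton cooling: dT/dt = -k * (T - T_room).
    t = t_start
    for _ in range(steps):
        t += -k * (t - t_room) * dt
    return t

# Hot coffee (90 C) in a 20 C room relaxes to the bath temperature.
t_final = relax(t_start=90.0, t_room=20.0)
assert abs(t_final - 20.0) < 1e-3  # equilibrium with the bath
```

Note the model relaxes in either direction: a cold drink warms up to the same room temperature, which is exactly the "no change once equilibrated" property described above.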

Two systems in mutual equilibrium satisfy the following conditions:

1. each is itself in an equilibrium state;

2. they remain in equilibrium when allowed to exchange heat.

Generalizing further, if we can be certain that the physical properties of two systems would not change were they allowed to exchange heat, then they can be taken to be in mutual equilibrium even if no heat exchange is actually permitted.

Thermal equilibrium is therefore a relation between thermodynamic systems. Mathematically, the zeroth law expresses that this relation is an equivalence relation. (Technically, this also requires each system to be in thermal equilibrium with itself.)

Equilibrium among many systems

A simple example shows why the zeroth law is needed. As noted above, two systems are in equilibrium when small amounts of extensive quantities can be exchanged between them (e.g. microscopic fluctuations) while the total energy stays constant (a decrease of energy cannot be reversed).

For simplicity, let N systems be adiabatically isolated from the rest of the universe, with the volume and composition of each held constant, so that the systems can exchange only heat (entropy). The result extends directly to exchanges of volume or mass.

Combining the first and second laws of thermodynamics relates the total energy fluctuation \delta U to the temperature T_i of the i-th system and its entropy fluctuation \delta S_i:

\delta U=\sum_i^N T_i\delta S_i

Being adiabatically isolated from the rest of the universe, the N systems' total entropy change must be zero:

\sum_i^N\delta S_i=0

In other words, entropy can only be exchanged among the N systems. This constraint can be used to rewrite the total energy fluctuation as:

\delta U=\sum_{i}^N(T_i-T_j)\delta S_i

where T_j is the temperature of any one system j among the N. Finally, at equilibrium the total energy fluctuation must be zero, so:

\sum_{i}^N(T_i-T_j)\delta S_i=0

This equation can be viewed as the product of the antisymmetric matrix T_i-T_j with the vector of entropy fluctuations being zero. For a nonzero solution to exist, i.e.

\delta S_i\ne 0,

the determinant of the matrix formed by T_i-T_j must vanish, whatever the choice of j.

However, by Jacobi's theorem, the determinant of an N×N antisymmetric matrix is always zero if N is odd; while if N is even, every entry T_i-T_j must be zero for the determinant to vanish, i.e. the systems are in equilibrium with T_i=T_j. This result implies that an odd number of systems would always be in equilibrium, with the individual temperatures and entropy fluctuations irrelevant; in the presence of entropy fluctuations, only for an even number of systems would equilibrium require equal temperatures.
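The parity fact invoked here, that an odd-dimensional antisymmetric matrix always has zero determinant (since det A = det Aᵀ = det(-A) = (-1)ᴺ det A), is easy to check numerically. A small sketch with a hand-rolled determinant, so no external libraries are needed:

```python
import random

def det(m):
    # Determinant via Gaussian elimination with partial pivoting.
    n = len(m)
    m = [row[:] for row in m]
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular within numerical tolerance
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            d = -d  # row swap flips the sign
        d *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return d

def random_antisymmetric(n):
    # a[i][j] = -a[j][i], zero diagonal
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = random.uniform(-1.0, 1.0)
            a[j][i] = -a[i][j]
    return a

random.seed(0)
assert abs(det(random_antisymmetric(5))) < 1e-9  # odd N: always singular

# An even-dimensional antisymmetric matrix is generically non-singular,
# e.g. this block-diagonal example has determinant (1*2)^2 = 4.
even = [[0.0, 1.0, 0.0, 0.0],
        [-1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 2.0],
        [0.0, 0.0, -2.0, 0.0]]
assert abs(det(even) - 4.0) < 1e-9
```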

The zeroth law of thermodynamics resolves this odd-even paradox. Consider any three mutually equilibrated systems among the N; by the zeroth law, one of them can be ignored. An odd number of systems can thus be reduced to an even number, and the derivation then makes T_i=T_j the necessary condition for equilibrium.

The same result applies to fluctuations in any extensive quantity, such as volume (equal pressures) or mass (equal chemical potentials). The zeroth law therefore concerns more than just temperature.

In short, the zeroth law breaks a certain antisymmetry present in the first and second laws.

The zeroth law and temperature

The zeroth law is often held to make it possible to establish a temperature function; put more loosely, to build a thermometer. This question is one of the topics in the philosophy of thermodynamics and statistical mechanics.

In the function space of thermodynamic variables, the regions of constant temperature form surfaces, which provide a natural ordering of nearby surfaces. One can then establish a global temperature function that gives a continuous ordering of states. The dimension of such an isothermal surface is the number of thermodynamic variables minus one (e.g. for an ideal gas with the three variables P, V, n, the isothermal surfaces are two-dimensional). The temperature so defined need not look like the Celsius scale; it is a function.

For an ideal gas, if two bodies of gas are in thermal equilibrium, then:

\frac{P_1 V_1}{N_1} = \frac{P_2 V_2}{N_2}

where P_i is the pressure of the i-th system, V_i is its volume, and N_i is its amount (number of moles or atoms).

The surfaces PV/N = const define the sets of equal temperature, and a common way to label them is to set PV/N = RT, where R is a constant, which defines the temperature T. Once defined, such systems can be used as thermometers to calibrate other systems.
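The PV/N = RT labelling above is directly computable; a minimal sketch of such an ideal-gas "thermometer" (the pressure/volume numbers are illustrative values, in SI units, with R = 8.314 J/(mol K)):

```python
R = 8.314  # molar gas constant, J/(mol K)

def gas_temperature(p, v, n):
    # Label the isotherm surface PV/N = RT by its temperature T.
    return p * v / (n * R)

# Two gas samples in thermal equilibrium must lie on the same PV/N surface:
t1 = gas_temperature(p=101325.0, v=0.0248, n=1.0)  # ~1 atm, 24.8 L, 1 mol
t2 = gas_temperature(p=202650.0, v=0.0124, n=1.0)  # double pressure, half volume
assert abs(t1 - t2) < 1e-6   # same isotherm, same temperature label
assert abs(t1 - 302.2) < 1.0  # roughly room temperature
```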

-------paper1----

http://abernacchi.user.jacobs-university.de/papers/bbsc12.pdf

In the HBM, parameters P and K determine the number of neurons in the hidden layers, while in the Hopfield model they represent the number of patterns stored in the network, or the number of stable states that can be retrieved. We consider the "high storage" regime, in which the number of stored patterns is linearly increasing with the number of neurons (Amit, 1992).

3.2. Free energy minimization and phase transition

We minimize the free energy (23) with respect to the order parameters q, p, r.

To obtain the final equation for the partition function, we sum the two Hamiltonians and divide by two, to find

Z_I \propto \sum_{\sigma}\exp\left\{\frac{\beta}{4N}\sum_{ij}^{N}\sigma_i\sigma_j\left[\sum_{\nu}^{\alpha N}\xi_i^{\nu}\xi_j^{\nu}\left(1+\frac{1}{1+\beta^2\gamma}\right)+\sum_{\mu}^{\gamma N}\xi_i^{\mu}\xi_j^{\mu}\left(1+\frac{1}{1+\beta^2\alpha}\right)\right]\right\}

Retaining only the first-order terms in \epsilon, we obtain an equivalent Hamiltonian for a HBM where the hidden layers interact.

This is the Hamiltonian of a Hopfield neural network. This result connects the two Hamiltonians of the Hopfield network and the Boltzmann Machine and states that the thermodynamics obtained from the first cost function, Eq. (6), is the same as the one obtained from the second one, Eq. (11). This offers a connection between retrieval through free energy minimization in the Hopfield network and learning through log-likelihood estimation in the HBM (Amit, 1992; Bengio, 2009). Note that observable quantities stemming from the HBM are equivalent in distribution, and not pointwise, to the corresponding ones in the Hopfield network.

Next, we calculate the free energy, which allows us to determine the value of all relevant quantities and the different phases of the system. The thermodynamic approach consists in averaging all observable quantities over both the noise and the configurations of the system.
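The mechanism behind the quoted Hopfield/BM equivalence is the Gaussian integral that removes a hidden unit: integrating a Gaussian hidden variable z coupled linearly to the visible spins leaves an effective pairwise (Hopfield-style) weight. A toy numerical check of that identity, ∫ exp(-z²/2 + a z) dz = √(2π) exp(a²/2) (the coupling normalization here is my own simplification, not the paper's exact convention):

```python
import math

def hidden_integral(a, grid=20000, lim=10.0):
    # Numerical integral of exp(-z^2/2 + a*z) over z in [-lim, lim].
    dz = 2 * lim / grid
    return sum(math.exp(-0.5 * z * z + a * z) * dz
               for z in (-lim + k * dz for k in range(grid)))

beta, n = 1.0, 4
xi = [1, -1, 1, 1]        # illustrative pattern
sigma = [1, -1, 1, 1]     # illustrative visible configuration
a = math.sqrt(beta / n) * sum(x * s for x, s in zip(xi, sigma))

numeric = hidden_integral(a)
closed = math.sqrt(2 * math.pi) * math.exp(0.5 * a * a)  # Gaussian identity
assert abs(numeric / closed - 1) < 1e-6
```

The closed form exp(a²/2) expands to exp(β/(2N) · (Σᵢ ξᵢσᵢ)²), which is exactly a Hopfield pattern term: this is why summing over hidden units reproduces the Hopfield Hamiltonian.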

a Hopfield model with an additional noise source, characterized by the Hamiltonian

H(\sigma;\xi,\eta)=-\frac{\beta}{2N}\sum_{ij}^{N}\sigma_i\sigma_j\left[\sum_{\nu}^{\alpha N}\xi_i^{\nu}\xi_j^{\nu}\left(1-\frac{\epsilon\beta^2\gamma}{4}\right)+\sum_{\mu}^{\gamma N}\xi_i^{\mu}\xi_j^{\mu}\left(1-\frac{\epsilon\beta^2\alpha}{4}\right)\right]. (37)

Note that for \epsilon = 0 we recover the standard Hopfield model. The effect of the additional noise source on the retrieval of patterns corresponding to one layer depends on the load of the other layer: the larger the number of neurons in one layer, the larger the perturbation on the retrieval of the other layer.

-----------another paper------

http://www.stieltjes.org/archief/biennial9596/frame/node22.html

Hopfield model for Neural Networks and Thermodynamic Limit

The type of investigation described above is also applied to analyse the dynamics of a Hopfield model for neural networks in [8].

The Hopfield model is the following neural network model for associative memory. We are given N neurons, each of which can be in state 0 or 1. We assume that the memory contains a given set of p images. At time t neuron i is selected with probability 1/N, and the new state of this neuron is determined according to conditional Gibbs probabilities with a given energy function, which we will not further specify. We consider only the zero temperature dynamics and then the new state of the neuron is deterministic and such that the energy of the new configuration does not increase. In our paper the energy function assigns lowest energy to the images themselves and so one expects that with probability 1 one of the images from memory is retrieved. This is not true: not only images where the energy has a global minimum, but also images where the energy has a local minimum can be retrieved. It is a well-known fact that global/local minima correspond to fixed points of the limiting dynamics.
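A minimal sketch of the zero-temperature dynamics just described: neurons are selected at random and updated so that the energy never increases. I use the equivalent ±1 spin convention and standard Hebbian weights as the energy function; the paper deliberately leaves its energy function unspecified, so this is an illustrative stand-in, not the paper's model.

```python
import random

def train(patterns):
    # Hebbian weights: w_ij = (1/N) * sum_p p_i * p_j, zero diagonal.
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def retrieve(w, state, sweeps=20, seed=0):
    # Zero-temperature asynchronous dynamics: pick neuron i with prob 1/N,
    # align it with its local field, so the energy H = -1/2 sum w_ij s_i s_j
    # never increases.
    rng = random.Random(seed)
    s = list(state)
    n = len(s)
    for _ in range(sweeps * n):
        i = rng.randrange(n)
        h = sum(w[i][j] * s[j] for j in range(n))
        if h != 0:
            s[i] = 1 if h > 0 else -1
    return s

pattern = [1, -1, 1, -1, 1, -1, 1, -1]
w = train([pattern])
noisy = list(pattern)
noisy[0] = -noisy[0]  # corrupt one bit of the stored image
assert retrieve(w, noisy) == pattern  # the memory is retrieved
```

With a single stored pattern there are no spurious local minima to fall into; the traps and quasi-attractors discussed below only show up at higher load, which this toy does not reach.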

Most research on this model (cf. for example [1]) deals with the domains of attraction of these fixed points, when the number of images grows in some prespecified way with the number of neurons N. Almost no results exist on the exact form of the dynamics in the thermodynamic limit N → ∞. Nevertheless, to understand the quality of the model, the limiting dynamics are an important tool. They give insight into questions such as whether from any input image one of the images from memory is retrieved, and how long it will take.

For analysing this problem we needed to reformulate the model as a Markov chain on a state space with a dimension independent of N. By using the commonly used overlap representation the Markov property gets lost and the understanding of the limiting dynamics becomes more complicated. Clearly, the obtained Markov chain still depends on N, and the jump probabilities depend more strongly on the initial state than in the random walk case. Therefore, for determining the limiting dynamics of the same time-space scaled process as above, a more general version of the LLN was required.

A surprising result was the existence of "traps": these are not fixed points, but nevertheless can be limit points of the limiting dynamics. Since non-trivial examples only occur in dimension at least 8, we show a generic example of the limiting dynamics in Figure 4: to each region corresponds a quasi-attractor attracting all images from this region. Hence an image is successively attracted by different quasi-attractors till it reaches a fixed point (A) or a "trap" (B).


Contrary to the random walk case where the "speed" along Euler paths is piecewise constant, the speed decreases exponentially while approaching the quasi-attractor. A trap is therefore never reached, although it is left immediately when it is reached. Translated back to the original system, it means that fixed points or traps are reached at a speed that is slower than linear in N.

The picture also shows the occurrence of scattering. Moreover, we have been able to prove that the limiting dynamics are acyclic and so bouncing back and forth between different quasi-attractors cannot happen.

The described analysis is a first step in my research on neural networks, conducted over the past year. The investigation of more complicated problems will be the next step. These problems concern the finite temperature dynamics; the number of fixed points; the question whether the time to reach an epsilon-distance of a local/global minimum or trap is uniformly bounded in the number of images; and the dynamics when the number of images grows with the number of neurons.
