## The Art of Doing Science and Engineering

Education is what, when, and why to do things; training is how to do it.

In science if you know what you are doing you should not be doing it. In engineering if you do not know what you are doing you should not be doing it.

All of engineering involves some creativity to cover the parts not known, and almost all of science includes some practical engineering to translate the abstractions into practice.

Often it is not physical limitations which control but rather human-made laws, habits, organizational rules, regulations, personal egos, and inertia which dominate the evolution to the future.

You must try to foresee the future you will face. To illustrate the importance of this point I often use a standard story. It is well known that the drunken sailor who staggers to the left or right with n independent random steps will, on the average, end up about $$\sqrt{n}$$ steps from the origin. But if there is a pretty girl in one direction, then his steps will tend to go in that direction and he will go a distance proportional to n. In a lifetime of many, many independent choices, small and large, a career with a vision will get you a distance proportional to n, while no vision will get you only the distance $$\sqrt{n}$$. In a sense, the main difference between those who go far and those who do not is that some people have a vision and the others do not and therefore can only react to the current events as they happen.
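The random-walk claim can be checked numerically. The sketch below is only an illustration: the function name `averageDistance`, the step model (unit steps, forward with probability `pForward`), and the seed are assumptions made for the example, not part of the original story.

```cpp
#include <cmath>
#include <cstdint>
#include <random>

// Average absolute displacement after n unit steps, where each step goes
// +1 with probability pForward and -1 otherwise, averaged over `trials` walks.
double averageDistance(int n, double pForward, int trials, std::uint32_t seed) {
    std::mt19937 rng(seed);                            // 32-bit uniform generator
    const double threshold = pForward * 4294967296.0;  // pForward * 2^32
    double total = 0.0;
    for (int t = 0; t < trials; ++t) {
        long position = 0;
        for (int i = 0; i < n; ++i) {
            position += (static_cast<double>(rng()) < threshold) ? 1 : -1;
        }
        total += std::fabs(static_cast<double>(position));
    }
    return total / trials;
}
```

With n = 10000, an unbiased walk (`pForward = 0.5`) should end up on the order of $$\sqrt{n} = 100$$ steps away, while even a slight bias (`pForward = 0.55`) should give roughly $$n(2p-1) = 1000$$ steps.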

To what extent history does or does not repeat itself is a moot question. But it is one of the few guides you have, hence history will often play a large role in my discussions—I am trying to provide you with some perspective as a possible guide to create your vision of your future.

In forming your plan for your future you need to distinguish three different questions: What is possible? What is likely to happen? What is desirable to have happen? In a sense the first is Science: what is possible. The second is Engineering: what are the human factors which choose the one future that does happen from the ensemble of all possible futures. The third is ethics, morals, or whatever other word you wish to apply to value judgments.

Lastly, in a sense, this is a religious course—I am preaching the message that, with apparently only one life to live on this earth, you ought to try to make significant contributions to humanity rather than just get along through life comfortably—that the life of trying to achieve excellence in some area is in itself a worthy goal for your life.

It has often been observed the true gain is in the struggle and not in the achievement—a life without a struggle on your part to make yourself excellent is hardly a life worth living.
Notice I leave it to you to pick your goals of excellence, but claim only that a life without such a goal is not really living but merely existing, in my opinion. In ancient Greece Socrates (469–399 BC) said: The unexamined life is not worth living.

Indeed, one of the major items in the conversion from hand to machine production is the imaginative redesign of an equivalent product. Thus in thinking of mechanizing a large organization, it won’t work if you try to keep things in detail exactly the same; rather there must be a larger give-and-take if there is to be a significant success.

You must get the essentials of the job in mind and then design the mechanization to do that job rather than trying to mechanize the current version, if you want a significant success in the long run.

We must not forget, in all the enthusiasm for computer simulations, that occasionally we must look at Nature as She is.

The Buddha told his disciples, "Believe nothing, no matter where you read it, or who said it, no matter if I have said it, unless it agrees with your own reason and your own common sense". I say the same to you—you must assume the responsibility for what you believe.

"Almost everyone who opens up a new field does not really understand it the way the followers do". The evidence for this is, unfortunately, all too good.

It has been said in physics no creator of any significant thing ever understood what he had done. I never found Einstein on the special relativity theory as clear as some later commentators.

The reason this happens so often is the creators have to fight through so many dark difficulties, and wade through so much misunderstanding and confusion, they cannot see the light as others can, now the door is open and the path made easy.

Hence I expect a lot of trouble until we do understand human communication via natural languages. Of course, the problem of human-machine communication is significantly different from human-human communication, but in which ways and how much seems to be not known nor even sought for. Until we better understand languages of communication involving humans as they are (or can be easily trained), it is unlikely many of our software problems will vanish.

There are many things we can do to reduce "the software problem", as it is called, but it will take some basic understanding of language as it is used to communicate understanding between humans, and between humans and machines, before we will have a really decent solution to this costly problem. It simply will not go away.

"Is programming closer to novel writing than it is to classical engineering?" I suggest yes!

Give the same complex problem to two modern programmers and you will, I claim, get two rather different programs. Hence my belief current programming practice is closer to novel writing than it is to engineering. The novelists are bound only by their imaginations, which is somewhat as the programmers are when they are writing software.

What you learn from others you can use to follow; What you learn for yourself you can use to lead.

Mathematics is nothing but clear thinking. Mathematics is the language of clear thinking.

Schools of mathematics:

• Platonic school (most)
• Formalists: When rigor enters, meaning departs.
• Logical school
• Intuitionists
• Constructionists

Fallacies:

1) we do not actually "prove" theorems!
2) many important programming problems cannot be defined sharply enough so a proof can be given; rather the program which emerges defines the problem!

Man is not a rational animal, he is a rationalizing animal.

I often suspect creativity is like sex; a young lad can read all the books you have on the topic, but without direct experience he will have little chance of understanding what sex is, and even with experience he may still not understand what is going on.

An expert is one who knows everything about nothing; A generalist knows nothing about everything. In an argument between a specialist and a generalist the expert usually wins by simply (1) using unintelligible jargon, and (2) citing their specialist results which are often completely irrelevant to the discussion.

All impossibility proofs must rest on a number of assumptions which may or may not apply in the particular situation. "If an expert says something can be done he is probably correct, but if he says it is impossible then consider getting another opinion."

What you did to become successful is likely to become counterproductive when applied at a later date. "There is never time to do the job right, but there is always time to fix it later," especially in computer software.

Hamming’s rule: 90% of the time the next independent measurement will fall outside the previous 90% confidence limits!

Most of the time each person is immersed in the details of one special part of the whole and does not think of how what they are doing relates to the larger picture. It is characteristic of most people they keep a myopic view of their work and seldom, if ever, connect it with the larger aims they will admit, when pressed hard, are the true goals of the system. This myopic view is the chief characteristic of a bureaucrat.

Systems engineering is the attempt to keep at all times the larger goals in mind and to translate local actions into global results. But there is no single larger picture.

The first rule of systems engineering is: If you optimize the components you will probably ruin the system performance.

Rule 2: Part of systems engineering design is to prepare for changes so they can be gracefully made and still not degrade the other parts.

Rule 3: The closer you meet specifications the worse the performance will be when overloaded.

Westerman believes, as I do, while the client has some knowledge of his symptoms, he may not understand the real causes of them, and it is foolish to try to cure the symptoms only. Thus while the systems engineers must listen to the client they should also try to extract from the client a deeper understanding of the phenomena. Therefore, part of the job of a systems engineer is to define, in a deeper sense, what the problem is and to pass from the symptoms to the causes.

The deeper, long term understanding of the nature of the problem must be the goal of the system engineer, whereas the client always wants prompt relief from the symptoms of his current problem. Again, a conflict leading to a meta systems engineering approach!

You may think the title "You Get What You Measure" means if you measure accurately you will get an accurate measurement, and if not then not; but it refers to a much more subtle thing: the way you choose to measure things controls to a large extent what happens.

For example, in school it is easy to measure training and hard to measure education, and hence you tend to see on final exams an emphasis on the training part and a great neglect of the education part.

## Real Time Shading Model (4)

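T is the tangent vector of the hair strand, N is the strand's principal normal vector, V is the view direction, and L is the light direction; H is the half vector between L and V. The specular term is then: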

$$J_s = C_s \cdot \left(\sqrt{1 - (T\cdot H)^2}\right)^e$$

$$J_d = \mathrm{saturate}\left(\frac{N\cdot L + w}{(1 + w)^2}\right)$$

$$J_d=max(0,0.75*N\cdot L + 0.25)$$

```hlsl
// Wrapped diffuse term: 0.75 * N.L + 0.25, clamped to [0, 1].
float HairDiffuseTerm(float3 N, float3 L)
{
    return saturate(0.75 * dot(N, L) + 0.25);
}

// Single specular lobe: sin(T, H) raised to the specular exponent.
float HairSingleSpecularTerm(float3 T, float3 H, float exponent)
{
    float dotTH = dot(T, H);
    float sinTH = sqrt(1.0 - dotTH * dotTH);
    return pow(sinTH, exponent);
}

// Shift the tangent along the normal to move the highlight along the strand.
float3 ShiftTangent(float3 T, float3 N, float shiftAmount)
{
    return normalize(T + shiftAmount * N);
}

float4 main(PS_INPUT i) : COLOR
{
    // shift tangents
    float shiftTex = tex2D(tSpecularShift, i.texCoord).r - 0.5;

    float3 N = normalize(i.normal);

    float3 T1 = ShiftTangent(i.tangent, N, specularShift.x + shiftTex);
    float3 T2 = ShiftTangent(i.tangent, N, specularShift.y + shiftTex);

    // diffuse term
    float3 diffuse = hairBaseColor * diffuseLightColor * HairDiffuseTerm(N, i.lightVec);

    // specular term
    float3 H = normalize(i.lightVec + i.viewVec);
    float3 specular = specularColor0 * HairSingleSpecularTerm(T1, H, specularExp.x);

    float3 specular2 = specularColor1 * HairSingleSpecularTerm(T2, H, specularExp.y);

    // modulate secondary specular term with noise
    float specularMask = tex2D(tSpecularMask, i.texCoord * 10.0f).r;
    specular2 *= specularMask;

    // specular attenuation for hair facing away from light
    float specularAttenuation = saturate(1.75 * dot(N, i.lightVec) + 0.25);

    specular = (specular + specular2) * specularLightColor * specularAttenuation;

    // read base texture (first channel only, as in the original scalar read)
    float base = tex2D(tBase, i.texCoord).r;

    // combine terms for final output
    float4 o;

    o.rgb = (diffuse + ambientLightColor * hairBaseColor) * base;
    base = 1.5 * base - 0.5;
    o.rgb += specular * base;
    //o.rgb *= i.ambOcc;
    o.a = tex2D(tAlpha, i.texCoord).r; // read alpha texture

    return o;
}
```

## Real Time Shading Model (3)

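• Punctual light: the light source is assumed to be infinitely far away and infinitely small, so its color and direction define it completely.
• HDR lighting: lighting results must be stored with sufficiently high precision; otherwise a physically based model is meaningless.
• Linear pipeline: rendering must use a linear pipeline, gamma must be handled correctly, and the color spaces of parameter textures must be consistent.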

Normalized Blinn-Phong

$$I_{o}=I_{i}\left((N\cdot L) k_dC_d + k_sC_s(N\cdot H)^e\right)$$

Ashikhmin-Shirley

The specular and diffuse terms are as follows:

$$J_s=\frac{\sqrt{(n_u+1)(n_v+1)}}{8\pi}\frac{(N\cdot H)^{\frac{n_u (H\cdot T)^2+n_v(H\cdot B)^2}{1-(H\cdot N)^2}}}{(V\cdot H)\cdot \max(N\cdot L,N\cdot V)}F_r(V\cdot H,C_s)$$

$$J_d=\frac{28}{23\pi}C_d(1-C_s)\left(1-\left(1-\frac{N\cdot V}{2}\right)^5\right)\left(1-\left(1-\frac{N\cdot L}{2}\right)^5\right)$$

For the isotropic case $$n_u=n_v=n$$ the specular term simplifies to:

$$J_s=\frac{n+1}{8\pi}\frac{(N\cdot H)^n}{(V\cdot H)\cdot \max(N\cdot L,N\cdot V)}F_r(V\cdot H,C_s)$$

$$F_r(u,\rho_s)=\rho_s + ( 1-\rho_s)(1-u)^5$$
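The last two formulas translate almost directly into code. The following is a minimal sketch, assuming the dot products have already been computed from unit vectors and treating the color $$C_s$$ as a single channel. The function names are made up for this example.

```cpp
#include <algorithm>
#include <cmath>

// Schlick's approximation: F_r(u, rho_s) = rho_s + (1 - rho_s)(1 - u)^5,
// with u = V.H and rho_s the specular reflectance at normal incidence.
double schlickFresnel(double u, double rhoS) {
    return rhoS + (1.0 - rhoS) * std::pow(1.0 - u, 5.0);
}

// Isotropic Ashikhmin-Shirley specular term J_s, written in terms of the
// dot products N.H, V.H, N.L, N.V, the exponent n, and the reflectance cS.
double ashikhminShirleyIsotropic(double nDotH, double vDotH, double nDotL,
                                 double nDotV, double n, double cS) {
    const double kPi = 3.14159265358979323846;
    const double denom = vDotH * std::max(nDotL, nDotV);
    if (denom <= 0.0) {
        return 0.0;  // light or viewer below the horizon: no highlight
    }
    return (n + 1.0) / (8.0 * kPi) * std::pow(nDotH, n) / denom
           * schlickFresnel(vDotH, cS);
}
```

At normal incidence (`u = 1`) the Fresnel factor equals `rhoS`, and at grazing incidence (`u = 0`) it rises to 1, which is the behavior the approximation is designed to capture.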

## Real Time Shading Model (2)

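• The surface is assumed to consist of continuous, symmetric V-shaped grooves, each formed by two opposing facets.
• The area of each microfacet is much larger than the wavelength of light, so effects such as diffraction can be ignored.
• Each microfacet is a Lambertian surface, and all of them share the same diffuse reflectance.
• Each microfacet is very small compared with the whole surface of the object, so a single pixel covers enough microfacets for them to be modeled with a statistical distribution.

Oren-Nayar has an analytical model, but because it is too expensive to evaluate, a simplified model is generally used: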

$$J_{d}=C_{d}\left(A+B\cdot \max(0,V'\cdot L')\cdot\sin\alpha\tan\beta\right)$$

$$A=1-\frac{1}{2}\frac{\sigma^{2}}{(\sigma^2+0.33)}$$

$$B=0.45\frac{\sigma^{2}}{\sigma^2+0.09}$$

$$\alpha=\max(\theta_i,\theta_o),\beta=\min(\theta_i,\theta_o)$$

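$$V'$$ and $$L'$$ are the normalized projections of the view vector and the light direction onto the tangent plane, and $$\theta_i$$, $$\theta_o$$ are the angles that the light and view directions make with the normal.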
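The simplified model can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: all vectors are taken to be unit length, `albedo` stands in for $$C_d$$ as a single channel, and the `Vector3` type and function names are invented for the example.

```cpp
#include <algorithm>
#include <cmath>

struct Vector3 { double x, y, z; };

double dotProduct(const Vector3& a, const Vector3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Simplified Oren-Nayar diffuse term J_d for unit vectors N, L, V.
// sigma is the surface roughness in radians.
double orenNayarDiffuse(const Vector3& N, const Vector3& L, const Vector3& V,
                        double sigma, double albedo) {
    const double s2 = sigma * sigma;
    const double A = 1.0 - 0.5 * s2 / (s2 + 0.33);
    const double B = 0.45 * s2 / (s2 + 0.09);

    // Clamp the cosines so acos stays well defined under rounding error.
    const double cosI = std::min(1.0, std::max(-1.0, dotProduct(N, L)));
    const double cosO = std::min(1.0, std::max(-1.0, dotProduct(N, V)));
    const double thetaI = std::acos(cosI);
    const double thetaO = std::acos(cosO);
    const double alpha = std::max(thetaI, thetaO);
    const double beta = std::min(thetaI, thetaO);

    // Project L and V onto the tangent plane; the cosine of the angle
    // between the projections plays the role of V'.L'.
    const Vector3 Lp = {L.x - cosI * N.x, L.y - cosI * N.y, L.z - cosI * N.z};
    const Vector3 Vp = {V.x - cosO * N.x, V.y - cosO * N.y, V.z - cosO * N.z};
    const double lenL = std::sqrt(dotProduct(Lp, Lp));
    const double lenV = std::sqrt(dotProduct(Vp, Vp));
    double cosPhi = 0.0;
    if (lenL > 1e-8 && lenV > 1e-8) {
        cosPhi = dotProduct(Vp, Lp) / (lenL * lenV);
    }

    return albedo * (A + B * std::max(0.0, cosPhi) * std::sin(alpha) * std::tan(beta));
}
```

With `sigma = 0` the term reduces to plain Lambert (`A = 1`, `B = 0`). For a rough surface the result is larger when the viewer looks from the same side as the light than from the opposite side.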

Strauss

$$J_d=r_d (1-ms)C, r_d =(1-s^3)(1-t)$$

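The diffuse term decreases as smoothness (s), transparency (t), and metalness (m) increase. This is physically correct, but one drawback is that artists find the diffuse texture color they painted differs a lot from the final diffuse color, which makes it hard to control.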

$$J_s=\frac{1}{N\cdot L}r_s C_s$$

$$r_s=r_j(V\cdot R)^e,\quad e=\frac{3}{1-s}$$

$$r_j=\min(1,r_n+(r_n+k_j)j)$$

$$k_j=0.1,r_n=(1-t)-r_d$$

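j approximates the product of the Fresnel term and the geometric attenuation terms; note that both functions take the angle itself as their argument.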

$$j=F(\theta_i)G(\theta_i)G(\theta_o)$$

$$F(\alpha)=\frac{\frac{1}{(2\alpha/\pi-k_f)^2}-\frac{1}{k_{f}^2}}{\frac{1}{(1-k_f)^2}-\frac{1}{k_{f}^2}}$$

$$G(\alpha)=\frac{\frac{1}{(1-k_g)^2}-\frac{1}{(2\alpha/\pi-k_g)^2}}{\frac{1}{(1-k_g)^2}-\frac{1}{k_{g}^2}}$$

$$C_s=C_w+m(1-F(\theta_i))(C-C_w)$$
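The two approximation functions are easy to get wrong by a sign, so a small sketch may help. The constants `kF = 1.12` and `kG = 1.01` are not given in the notes above; they are the values usually quoted for Strauss's model and should be treated as assumptions here, as should the function names.

```cpp
#include <cmath>

// Assumed constants, not stated in the text above.
const double kF = 1.12;
const double kG = 1.01;
const double kHalfPi = 1.57079632679489661923;

double sq(double x) { return x * x; }

// Fresnel approximation F(alpha); alpha is an angle in radians in [0, pi/2].
double straussF(double alpha) {
    const double x = alpha / kHalfPi;  // 2 * alpha / pi
    return (1.0 / sq(x - kF) - 1.0 / sq(kF)) /
           (1.0 / sq(1.0 - kF) - 1.0 / sq(kF));
}

// Geometric attenuation approximation G(alpha), same argument convention.
double straussG(double alpha) {
    const double x = alpha / kHalfPi;
    return (1.0 / sq(1.0 - kG) - 1.0 / sq(x - kG)) /
           (1.0 / sq(1.0 - kG) - 1.0 / sq(kG));
}

// j = F(theta_i) * G(theta_i) * G(theta_o)
double straussJ(double thetaI, double thetaO) {
    return straussF(thetaI) * straussG(thetaI) * straussG(thetaO);
}
```

Both functions are normalized to the range [0, 1]: F rises from 0 at normal incidence to 1 at grazing incidence, while G falls from 1 to 0.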

## Real Time Shading Model (1)

1. Basic models

a. Original

b. Diffuse Only:

c. Specular Only:

$$I_{o}=I_{i}(N\cdot L)( k_sJ_s+k_dJ_d )$$

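Lambert: the oldest model. It only models an ideal diffuse surface, i.e. it assumes the surface is perfectly smooth and reflects the same amount of diffuse light in every direction. There is no specular term.

Phong: published in 1975, a simple empirical model that adds a specular highlight term. Phong highlights look strongly plastic, can show abrupt changes in intensity, and do not work well with normal maps.

$$I_{o}=I_{i}\left((N\cdot L) k_dC_d + k_sC_s(V\cdot R)^e\right)$$

Blinn-Phong: uses N·H instead of V·R to compute the highlight intensity. H is the half vector, i.e. the normalized average of the incident light direction and the viewer's line of sight. Blinn-Phong is more efficient because H only has to be computed once per light, whereas R has to be recomputed at every point, and $$(N\cdot H)^{4e} \approx (V\cdot R)^{e}$$. The theoretical basis of Blinn-Phong is microfacet theory: the surface is covered with a large number of microfacets that reflect light toward the viewer, and only the microfacets whose normals are aligned with H contribute to the reflection.

$$I_{o}=I_{i}\left((N\cdot L) k_dC_d + k_sC_s(N\cdot H)^e\right)$$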
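The relation between the two highlight terms can be checked with a small sketch. It assumes unit-length input vectors; the `Vec3` type and the function names are illustrative and not taken from any particular engine.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 normalize(const Vec3& v) {
    const double len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

// Phong highlight: (V.R)^e, with R the reflection of L about N.
double phongSpecular(const Vec3& N, const Vec3& L, const Vec3& V, double e) {
    const double nl = dot(N, L);
    const Vec3 R = {2.0 * nl * N.x - L.x, 2.0 * nl * N.y - L.y, 2.0 * nl * N.z - L.z};
    return std::pow(std::max(0.0, dot(V, R)), e);
}

// Blinn-Phong highlight: (N.H)^e, with H the normalized half vector.
double blinnPhongSpecular(const Vec3& N, const Vec3& L, const Vec3& V, double e) {
    const Vec3 H = normalize({L.x + V.x, L.y + V.y, L.z + V.z});
    return std::pow(std::max(0.0, dot(N, H)), e);
}
```

For a viewer close to the mirror direction, `blinnPhongSpecular(N, L, V, 4 * e)` comes out very close to `phongSpecular(N, L, V, e)`, which is the $$(N\cdot H)^{4e} \approx (V\cdot R)^{e}$$ rule of thumb.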

Phong vs Blinn-Phong

$$J_{s}=\frac{F(L,H)G(L,V,H)D(H)}{4(N\cdot L)(N\cdot V)}$$

1. Conceptual consistency

2. Vision

3. Prototype iteration

4. Requirements gathering and refinement

5. Feedback speed

6. Parallel editing

## Some Notes on Unicode and String Handling

Pre-Unicode

1. The ANSI standard

2. The DBCS or MBCS standards

3. Issues to consider when supporting internationalization with MBCS

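• Use _mbsinc and _mbsdec to step a string pointer forward and backward.
• Use _mbclen to get the byte length of a character. For example, the code below scans for the escape character; because it indexes into a character array, it has to account for the variable-length encoding:

```c
while ( rgch[ i ] != '\\' )
    i += _mbclen ( rgch + i );
```

• Use _getmbcp to get the current code page.
• Use _mbscpy to copy whole strings. When copying one character at a time, as in the loop below, use _mbccpy so that both bytes of a double-byte character are copied:

```c
while( *sz2 )
{
    // 'X' is a single-byte character, so a direct comparison works here;
    // to compare against a multibyte character, use _mbccmp instead.
    if( *sz2 != 'X' )
    {
        _mbccpy( sz1, sz2 );   // copy one (possibly double-byte) character
        sz1 = _mbsinc( sz1 );
        sz2 = _mbsinc( sz2 );
    }
    else
        sz2 = _mbsinc( sz2 );
}
*sz1 = '\0';                   // terminate the destination string
```

• The correct way to compare characters is through the library functions, e.g. if( !_mbccmp( sz1, sz2 ) ). If you know you are comparing single-byte ANSI characters you can compare them directly, but this is not recommended.
• Beware of buffer overflow pitfalls.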

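```c
// Unsafe: copies byte by byte, so the last byte written may be the
// lead byte of a double-byte character.
cb = 0;
while( cb < sizeof( rgch ) )
    rgch[ cb++ ] = *sz++;

// Safer: check the length of the next character before copying it
// (the rest of this loop is cut off in the source).
cb = 0;
while( (cb + _mbclen( sz ))
```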
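The same idea, never copying part of a character into a bounded buffer, can be shown in portable code. The sketch below deliberately swaps the MBCS runtime functions for a UTF-8 lead-byte check so that it compiles anywhere. `utf8CharLength` and `copyWholeChars` are names invented for this example, and this is not the truncated loop from the original.

```cpp
#include <cstddef>
#include <cstring>

// Byte length of a UTF-8 sequence, judged from its lead byte.
std::size_t utf8CharLength(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;  // stray continuation or invalid byte: treat as a single byte
}

// Copies as many whole characters of src as fit into dst, which holds
// `capacity` bytes including the terminator. Returns the bytes copied.
std::size_t copyWholeChars(char* dst, std::size_t capacity, const char* src) {
    if (capacity == 0) return 0;
    std::size_t cb = 0;
    while (*src) {
        std::size_t len = utf8CharLength(static_cast<unsigned char>(*src));
        // Clamp to the bytes actually present, in case src is truncated.
        std::size_t avail = 0;
        while (avail < len && src[avail] != '\0') ++avail;
        len = avail;
        if (cb + len + 1 > capacity) break;  // next character would not fit
        std::memcpy(dst + cb, src, len);
        cb += len;
        src += len;
    }
    dst[cb] = '\0';
    return cb;
}
```

Copying `"a"` followed by one three-byte character into a four-byte buffer keeps only `"a"`: the three-byte character plus the terminator would not fit, so it is dropped whole instead of being cut in the middle.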

Unicode

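Unicode is a standard that aims to cover the characters of every language on Earth with a single encoding. For various reasons, however, Unicode went through several versions before it finally became clear how many characters there are, to the point that later versions of the standard are incompatible with the early ones. In 1988 Becker published the first Unicode draft, proposing 16 bits to represent all characters (that is, people at the time believed 65,536 characters would be enough for everyone on Earth). The official standard was released in 1991, which is the UCS-2 standard, and many new systems and frameworks adopted it, such as Qt, Windows NT, and Java. In actual use, people found that 16 bits were nowhere near enough, so in 1996 the UTF-16 standard appeared, which encodes a character in 2 or 4 bytes. At the time of writing there are 109,449 characters in total, of which CJK accounts for about 74,500.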

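• UCS-2 stores every character in 2 bytes. On Windows, wchar_t is 2 bytes wide and originally held UCS-2 (current Windows APIs treat it as a UTF-16 code unit).
• UTF-16 uses 2 or 4 bytes per character. To support both big-endian and little-endian machines efficiently, a byte order mark (U+FEFF, stored as the bytes FE FF or FF FE) at the start of the document specifies its byte order. UTF-16 is a superset of UCS-2. Either the high or the low 8 bits of a UTF-16 code unit may be 0x00, so UTF-16 is not compatible with ASCII. This is one of the main reasons Thompson invented UTF-8.
• UTF-8 was designed by Ken Thompson while he was working on the Plan 9 system at Bell Labs. Code points below 128 are encoded in one byte, and larger ones use 2 to 6 bytes depending on their magnitude. Because it is compatible with ASCII, and multi-language support can be handled correctly with relatively few changes, it is the mainstream encoding on Unix, Linux, and the Web.

| Bytes | Bits | Hex Min | Hex Max | Byte Sequence in Binary |
|---|---|---|---|---|
| 1 | 7 | 00000000 | 0000007F | 0vvvvvvv |
| 2 | 11 | 00000080 | 000007FF | 110vvvvv 10vvvvvv |
| 3 | 16 | 00000800 | 0000FFFF | 1110vvvv 10vvvvvv 10vvvvvv |
| 4 | 21 | 00010000 | 001FFFFF | 11110vvv 10vvvvvv 10vvvvvv 10vvvvvv |
| 5 | 26 | 00200000 | 03FFFFFF | 111110vv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv |
| 6 | 31 | 04000000 | 7FFFFFFF | 1111110v 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv |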
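The byte layouts listed above map directly onto a small encoder. This is a minimal sketch that covers only the 1- to 4-byte forms, since Unicode code points stop at U+10FFFF. The function name `encodeUtf8` and the choice to return an empty string for invalid input are assumptions of this example.

```cpp
#include <cstdint>
#include <string>

// Encodes one code point as UTF-8. Returns an empty string for surrogates
// (U+D800..U+DFFF) and for values above U+10FFFF.
std::string encodeUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);                          // 0vvvvvvv
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));            // 110vvvvv
        out += static_cast<char>(0x80 | (cp & 0x3F));          // 10vvvvvv
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;          // surrogate
        out += static_cast<char>(0xE0 | (cp >> 12));           // 1110vvvv
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (cp >> 18));           // 11110vvv
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, U+4E2D (中) falls in the 3-byte range and encodes to the bytes E4 B8 AD, while U+0024 ($) stays a single ASCII byte.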

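• UTF-8 and UTF-16 are both variable-length encodings with a maximum of 4 bytes per character (the original UTF-8 definition also allowed 5-byte and 6-byte sequences, but these are not in use, so in practice it is 1 to 4 bytes).
• UTF-8 is independent of endianness, whereas UTF-16 comes in UTF-16LE and UTF-16BE.
• wchar_t is 2 bytes on some platforms and 4 bytes on others.
• Pure Chinese text is somewhat smaller in UTF-16 than in UTF-8 (2 bytes per character versus 3), but in typical game data files ASCII characters still make up the bulk, so UTF-8 does save bandwidth and memory.
• Any byte-oriented string search algorithm can be used directly on UTF-8, because the encoding guarantees that the byte sequence of one character never appears inside another.
• The bytes FE and FF never appear in UTF-8, so it cannot be confused with UTF-16.
• UTF-8 may carry a BOM to mark the document as UTF-8, but many documents now choose to omit this marker.
• UTF-8 can encode any Unicode character without regard to code pages, so text in several languages can be output at the same time.
• Sorting UTF-8 strings as unsigned bytes gives the same result as sorting them by Unicode code points, which is a very convenient design!
• There is no such thing as plain text: you must know how a string is encoded in order to parse it correctly. Take web pages as an example. A server may host a large number of pages in different encodings, so the encoding often cannot be declared by the server and is left to the browser to determine. The client first reads the page looking for the Content-Type field to obtain the encoding, then re-parses the whole page. If no encoding is found, browsers behave differently; IE, for example, guesses the most likely encoding from features such as character frequency.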
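Several of the properties listed above can be demonstrated in a few lines. The sketch below assumes well-formed UTF-8 input, and the helper names `countCodePoints` and `bytewiseLess` are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Counts code points by skipping continuation bytes (10vvvvvv): every
// character contributes exactly one byte that is not a continuation byte.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Compares two UTF-8 strings as sequences of unsigned bytes. For valid
// UTF-8 this gives the same order as comparing the code points.
bool bytewiseLess(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}
```

A byte-oriented search such as `std::string::find` also works unchanged: looking for the three bytes of one Chinese character in a UTF-8 string can only match at a character boundary.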

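4. Practical considerations when using UTF-8

• Build the whole project with UNICODE defined, to avoid passing narrow strings to the Windows API by mistake.
• Unless specified otherwise, treat every std::string and char* as UTF-8 encoded.
• Consider using high-quality third-party libraries such as Boost.Locale.
• Do not use wchar_t, _T(), L"" and the like in high-level logic code; use them only in the low-level layer.
• MSVC's fstream and related file classes do not support UTF-8 file names, so the only option is a non-standard extension: convert the UTF-8 file name to UTF-16 and pass that to fstream. File operations are therefore better wrapped by the low-level layer, for example using _wfopen_s().

```cpp
// Decodes the UTF-8 sequence at `it` into a code point and advances `it`
// past it. No validation is performed; mask8 and sequence_length are
// helpers from the surrounding library.
template <typename octet_iterator>
uint32_t next(octet_iterator& it)
{
    uint32_t cp = internal::mask8(*it);
    typename std::iterator_traits<octet_iterator>::difference_type length =
        utf8::internal::sequence_length(it);
    switch (length) {
        case 1:
            break;
        case 2:
            it++;
            cp = ((cp << 6) & 0x7ff) + ((*it) & 0x3f);
            break;
        case 3:
            ++it;
            cp = ((cp << 12) & 0xffff) + ((internal::mask8(*it) << 6) & 0xfff);
            ++it;
            cp += (*it) & 0x3f;
            break;
        case 4:
            ++it;
            cp = ((cp << 18) & 0x1fffff) + ((internal::mask8(*it) << 12) & 0x3ffff);
            ++it;
            cp += (internal::mask8(*it) << 6) & 0xfff;
            ++it;
            cp += (*it) & 0x3f;
            break;
    }
    ++it;
    return cp;
}
```

• Use UTF-8 encoding for all data files as well.
• Do not put UTF-8 string literals directly in the code; read them all from files, to avoid source file encoding problems.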

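## Vision and Art: The Biology of Seeing, Reading Notes (2)

• Primary colors: none of the three colors can be produced by mixing the other two, while other colors can be mixed from the three in some proportion. These three independent colors are called the primary colors. The retina does have three kinds of photoreceptor cells, each sensitive to one of the primaries.
• Complementary (opponent) colors: this theory holds that color perception does not directly encode the responses of the three photoreceptor types and synthesize color from three primaries. Instead it encodes the difference between pairs of primaries, hence the name. Neurobiological research found that because the response curves of the three photoreceptor types overlap, encoding the differences rather than the full curves requires less information, and the brain can still recover all of it through processing.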

• Perspective
• Occlusion
• Haze (distant objects become blurred by fog and atmospheric scattering)
• Stereopsis (stereoscopic vision)
• Relative motion

Chiaroscuro (the use of light and shade)

## Vision and Art: The Biology of Seeing, Reading Notes (1)

• Retinal pigment epithelium
• Photoreceptor layer, containing the rod and cone cells
• Ganglion cell layer: contains the nuclei of the ganglion cells; the optic nerve starts here
• Inner nuclear layer (also called the inner granular layer): made up of the nuclei of bipolar cells, horizontal cells, amacrine cells, and Müller cells
• Nerve fiber layer: consists mainly of the axons of the ganglion cells

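• Surprisingly, the photoreceptor layer is the outermost layer of the retina, so it is the last to receive light. For this reason, in the central fovea the three layers of nerve cells are pushed aside to maximize resolution, exposing a small area where light passes through fewer intermediate cells.
• After the photoreceptor layer is stimulated by light and produces a signal, the intermediate layers of neurons integrate and pre-process it.
• Photoreceptors transmit signals differently from ordinary neurons: they normally sit in a high-potential state and lower their potential only when light stimulation arrives.
• Cone cells work in relatively bright environments and can distinguish colors. Rod cells work in relatively dark environments, have lower resolution, and cannot distinguish colors.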

Center/Surround原理

What and Where System

Update 2012.5.16: part two is here