
fix: youtube and formatting in blogs

pull/1/head
Rohan Verma committed 6 years ago
parent commit
9fe0b735c6
44 changed files with 210 additions and 343 deletions
  1. +2 -0 content/blog/2016-01-29-twitter-bots-using-tweepy.md
  2. +2 -2 content/blog/2016-03-22-blip.md
  3. +1 -1 content/blog/2016-04-15-foodify-app-hacknsit-2016.md
  4. +1 -1 content/blog/2016-05-07-adding-support-for-vector-instructions-to-8051-architecture.md
  5. +39 -42 content/blog/2016-08-06-topological-sort-for-problems-using-dag.md
  6. +4 -4 content/blog/2016-10-22-labeled-tweet-generator-and-galaxy-image-classifier-featured-in-sirajologys-youtube-videos.md
  7. +1 -1 content/blog/2016-11-01-todays-git-tip-in-gitconfig-url-gitgithub.md
  8. +1 -0 content/blog/2016-11-02-i-wonder-what-linus-torvalds-view-is-about.md
  9. +1 -1 content/blog/2016-11-07-a-tip-on-using-fsck-when-you-are.md
  10. +1 -1 content/blog/2016-11-09-some-journal-publications-require-you-to-put-author.md
  11. +1 -1 content/blog/2016-11-13-toured-seville-today-thanks-to-https-www-feelthecitytours.md
  12. +1 -1 content/blog/2016-11-25-i-recently-corrupted-my-zsh-history-and-was.md
  13. +2 -2 content/blog/2016-11-29-octoshark-hackathon.md
  14. +1 -0 content/blog/2016-12-12-sorting-out-my-todo-list-for-the-next.md
  15. +0 -12 content/blog/2017-01-02-.md
  16. +0 -12 content/blog/2017-01-06-.md
  17. +0 -12 content/blog/2017-01-07-.md
  18. +3 -4 content/blog/2017-01-12-snu-data-limit.md
  19. +1 -1 content/blog/2017-02-04-i-used-to-use-the-l-flag.md
  20. +0 -13 content/blog/2017-02-09-.md
  21. +2 -2 content/blog/2017-02-09-vorstellungsreprasentanz.md
  22. +1 -1 content/blog/2017-02-14-survey-paper-on-security-in-wireless-sensor-networks.md
  23. +1 -1 content/blog/2017-04-20-retrofitting-led-lamps-into-smart-lamps.md
  24. +0 -15 content/blog/2017-05-20-.md
  25. +2 -1 content/blog/2017-07-27-216.md
  26. +0 -12 content/blog/2017-10-03-.md
  27. +7 -4 content/blog/2017-10-16-was-codification-of-odissi-successful-in-capturing-the-true-essence-of-the-dance-as-it-was-prevalent-or-even-as-it-was-performed-in-the-ancient.md
  28. +1 -2 content/blog/2017-11-30-emotive-adsense-project.md
  29. +1 -0 content/blog/2017-12-19-what-thefuck-is-wrong-with.md
  30. +4 -2 content/blog/2017-12-20-setting-up-latex-on-spacemacs.md
  31. +69 -47 content/blog/2017-12-21-deep-learning-through-the-lens-of-the-information-plane.md
  32. +3 -2 content/blog/2017-12-21-setting-up-python-on-spacemacs-and-using-pyenv-to-use-python3.md
  33. +4 -77 content/blog/2018-02-23-featured-on-googles-instagram-instagram.md
  34. +1 -1 content/blog/2018-03-18-extract-filenames-without-their-extensions.md
  35. +1 -1 content/blog/2018-05-11-genie-the-voice-enabled-coding-companion-winner-dell-intern-hackathon.md
  36. +1 -1 content/blog/2018-06-07-emacs-starts-a-bit-slow.md
  37. +29 -29 content/blog/2018-07-30-functional-options-for-testing-without-mocks-in-golang.md
  38. +3 -9 content/blog/2018-09-25-whistle-project-winner-ethindia-2018-hackathon.md
  39. +3 -3 content/blog/2018-11-19-streaming-audio-from-linux-to-android-using-pulseaudio-over-lan.md
  40. +8 -4 content/blog/2019-01-08-setting-so_reuseport-and-similar-socket-options-in-go-1-11.md
  41. +0 -11 content/blog/2019-02-23-.md
  42. +2 -4 content/blog/2019-03-17-a-review-of-the-siempo-launcher.md
  43. +2 -2 layouts/index.html
  44. +3 -1 layouts/section/blog_list.html

+2 -0 content/blog/2016-01-29-twitter-bots-using-tweepy.md

@@ -10,6 +10,8 @@ categories:
---
Unable to think what to tweet about? Have you ever faced a similar situation?

+<img src="https://naldzgraphics.net/wp-content/uploads/2009/04/twi1.jpg">

Well, it's very easy to create your own bots using Python's Tweepy module. You can use these skeletons I recently made for a workshop on the same topic. All you need to make your own bot is to add some logic to these skeletons.
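For a flavour of how little is needed, here is a minimal sketch of such a skeleton (the credentials are placeholders; it assumes Tweepy's classic `OAuthHandler`/`API` interface):

```python
import tweepy

# Placeholder credentials from your registered Twitter app; replace with your own.
CONSUMER_KEY = "..."
CONSUMER_SECRET = "..."
ACCESS_TOKEN = "..."
ACCESS_TOKEN_SECRET = "..."

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

# Your bot's "logic" goes here: decide what to tweet, then post it.
api.update_status("Hello from my Tweepy bot!")
```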

* * *

+2 -2 content/blog/2016-03-22-blip.md

@@ -18,7 +18,7 @@ tags:

We were inspired by the scene in The Time Machine (2002) where the protagonist enters a museum in the future.

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="465" height="360" src="https://www.youtube.com/embed/CQbkhYg2DzM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

During the hackathon we were able to make an app that relays RSSI values to our real-time database (RethinkDB), which works on a pub-sub model; the app queries the database for its calculated position and receives contextual information about its predicted position inside the building where the beacons have been set up.

@@ -26,7 +26,7 @@ During the hackathon we were able to make an app that relays RSSI values to our

Since the final submission deadline was extended, we were able to get back to our campus at night and shoot a demo video at our university's library.

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="465" height="360" src="https://www.youtube.com/embed/8IrnY7-q16A" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Finally, we were selected in the top 20 for the offline finals of IndiaHacks and went to Taj Vivanta, Bangalore. It was a nice experience where we got to improve our idea with the help of mentors that were available there. We tweaked the algorithm and the variables a bit for the demo room we made at the venue. We were surprised to be among the few student teams at the finale.


+1 -1 content/blog/2016-04-15-foodify-app-hacknsit-2016.md

@@ -29,7 +29,7 @@ Since we were a team of 4 composed of two python developers ([rhnvrm][2], [mrkar

You can see the demo video here:

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/7d7u1zjTrcM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>



+1 -1 content/blog/2016-05-07-adding-support-for-vector-instructions-to-8051-architecture.md

@@ -12,4 +12,4 @@ This was a group project for the Computer Architecture course at SNU under Prof.

[View Fullscreen][1]

-[1]: /wp-content/plugins/pdfjs-viewer-shortcode/pdfjs/web/viewer.php?file=/wp-content/uploads/2016/12/8051_Vectorization.pdf&download=true&print=true&openfile=false
+[1]: /wp-content/uploads/2016/12/8051_Vectorization.pdf

+39 -42 content/blog/2016-08-06-topological-sort-for-problems-using-dag.md

@@ -30,20 +30,18 @@ not have a topological sort. The proof for this can be found [here][1]

Suppose we have the following graphs:

```python
graph1 = { "x" : ["y"],
           "z" : ["y"],
           "y" : [],
           "a" : ["b"],
           "b" : ["c"],
           "c" : [] }
```

```python
graph2 = {"x" : ["y"], "y": ["x"]}
```

Here, you can notice how `graph1` has a toposort but for `graph2`, it does not exist. This is because of the fact there
@@ -59,40 +57,39 @@ on calculating the indegree of all the vertices and using Queue (although it can

Here is my implementation using Modified DFS and an array as a (kind-of) stack:

```python
def dfs_toposort(graph):
    L = []
    color = { u : "white" for u in graph }
    found_cycle = [False]

    for u in graph:
        if color[u] == "white":
            dfs_visit(graph, u, color, L, found_cycle)
        if found_cycle[0]:
            break

    if found_cycle[0]:
        L = []

    L.reverse()
    return L

def dfs_visit(graph, u, color, L, found_cycle):
    if found_cycle[0]:
        return
    color[u] = "gray"

    for v in graph[u]:
        if color[v] == "gray":
            found_cycle[0] = True
            return
        if color[v] == "white":
            dfs_visit(graph, v, color, L, found_cycle)

    color[u] = "black"
    L.append(u)
```

The function `dfs_toposort` returns an empty array if there exists a cycle in the graph.
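For instance, running the implementation above on the two example graphs gives (the exact order can vary with dict iteration order; this is what CPython 3.7+ insertion order produces):

```python
print(dfs_toposort(graph1))  # ['a', 'b', 'c', 'z', 'x', 'y'] -- one valid topological order
print(dfs_toposort(graph2))  # [] -- the x <-> y cycle is detected
```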


+4 -4 content/blog/2016-10-22-labeled-tweet-generator-and-galaxy-image-classifier-featured-in-sirajologys-youtube-videos.md

@@ -16,20 +16,20 @@ The first project I made was a Galaxy Image Classifier (<https://github.com/rhnv

It was based on this video:

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/QfNvhPx5Px8" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

And it was featured in the next video in the series:

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/ZE7qWXX05T0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

The second project was a Labeled Tweet Dataset Generator (<https://github.com/rhnvrm/labeled-tweet-generator>). Using this project, a data scientist can open <https://twitter-sentiment-csv.herokuapp.com/>, type a query in the search box, and look at the results; if the results look good, they can click the "Download as CSV" button to save them and work on them.

It was based on this video:

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/o_OZdbCzHUA" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

and was featured in this one:

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/9gBC9R-msAk" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>


+1 -1 content/blog/2016-11-01-todays-git-tip-in-gitconfig-url-gitgithub.md

@@ -1,5 +1,5 @@
---
-title: 'Today’s git tip In gitconfig url git@github…'
+title: 'Gitconfig tip for github'
author: rhnvrm
type: post
date: 2016-11-01T09:45:37+00:00

+1 -0 content/blog/2016-11-02-i-wonder-what-linus-torvalds-view-is-about.md

@@ -9,6 +9,7 @@ categories:
tags:
- git
format: link
+draft: true

---
I wonder what Linus Torvalds’ view is about “Gitless”

+1 -1 content/blog/2016-11-07-a-tip-on-using-fsck-when-you-are.md

@@ -1,5 +1,5 @@
---
-title: A tip on using fsck when you are…
+title: A tip on using fsck
author: rhnvrm
type: post
date: 2016-11-07T22:24:09+00:00

+1 -1 content/blog/2016-11-09-some-journal-publications-require-you-to-put-author.md

@@ -1,5 +1,5 @@
---
-title: Some journal publications require you to put author…
+title: Author Biography Alongside Pictures in Latex
author: rhnvrm
type: post
date: 2016-11-09T16:59:35+00:00

+1 -1 content/blog/2016-11-13-toured-seville-today-thanks-to-https-www-feelthecitytours.md

@@ -1,5 +1,5 @@
---
-title: Toured Seville today thanks to https www feelthecitytours…
+title: Toured Seville today thanks to FeelTheCityTours
author: rhnvrm
type: post
date: 2016-11-13T22:12:21+00:00

+1 -1 content/blog/2016-11-25-i-recently-corrupted-my-zsh-history-and-was.md

@@ -1,5 +1,5 @@
---
-title: I recently corrupted my zsh history and was…
+title: Fixing my zsh history
author: rhnvrm
type: post
date: 2016-11-25T18:35:39+00:00

+2 -2 content/blog/2016-11-29-octoshark-hackathon.md

@@ -29,11 +29,11 @@ The backend server of OctoShark on receiving a `GET` request on the `/create`

### Demo Video

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/YVKhtYZ9Cyo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

### Presentation Video

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/hEPKsGkPefs" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

### Future Work


+1 -0 content/blog/2016-12-12-sorting-out-my-todo-list-for-the-next.md

@@ -9,6 +9,7 @@ categories:
tags:
- misc
format: status
+draft: true

---
Sorting out my todo list for the next 3 weeks.

+0 -12 content/blog/2017-01-02-.md

@@ -1,12 +0,0 @@
---
title: Twenty Sixteen
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=126
categories:
- uncategorized
format: status

---

+0 -12 content/blog/2017-01-06-.md

@@ -1,12 +0,0 @@
---
title: Postmortem Week 1 – 2017
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=128
categories:
- uncategorized
format: status

---

+0 -12 content/blog/2017-01-07-.md

@@ -1,12 +0,0 @@
---
title: Tower of Hanoi
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=130
categories:
- uncategorized
format: status

---

+3 -4 content/blog/2017-01-12-.md → content/blog/2017-01-12-snu-data-limit.md

@@ -2,9 +2,8 @@
title: SNU Datalimit Chrome Extension
author: rhnvrm
type: post
-date: -001-11-30T00:00:00+00:00
-draft: true
-url: blog/?p=144
+date: 2017-07-22T23:43:41+00:00
+url: blog/2016/03/22/snu-data-limit
categories:
- projects
tags:
@@ -15,6 +14,6 @@ tags:
---
[<img class="aligncenter size-medium" src="https://github.com/rhnvrm/snu-data-limit/raw/master/screens/sample.png" width="640" height="400" />][1]

-The
+I developed a Chrome extension to track usage. You can view the code on [GitHub](https://github.com/rhnvrm/snu-data-limit/).

[1]: https://chrome.google.com/webstore/detail/snudatalimit/mfjinloagcpmfacpjnlabcflnkbajidd

+1 -1 content/blog/2017-02-04-i-used-to-use-the-l-flag.md

@@ -1,5 +1,5 @@
---
-title: I used to use the ` L` flag…
+title: SOCKS Proxy
author: rhnvrm
type: post
date: 2017-02-04T18:21:25+00:00

+0 -13 content/blog/2017-02-09-.md

@@ -1,13 +0,0 @@
---
title: I’m taking a class on Psychoanalysis of Films…
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=177
categories:
- uncategorized
format: status

---
I’m taking a class on Psychoanalysis of Films. One of the tasks of the course is to make a 10 page screenplay by the end of the course. I recently read about Lacan’s interpretation of Freud’s Vorstellungsrepräsentanz.

+2 -2 content/blog/2017-02-09-vorstellungsreprasentanz.md

@@ -14,7 +14,7 @@ tags:
- sociology

---
-<figure style="width: 230px" class="wp-caption alignright">[<img class="" src="https://upload.wikimedia.org/wikipedia/commons/9/99/Las_Meninas_01.jpg" width="230" height="265" />][1]<figcaption class="wp-caption-text">Las Meninas</figcaption></figure>
+<figure style="width: 230px" class="wp-caption alignright"><img class="" src="https://upload.wikimedia.org/wikipedia/commons/9/99/Las_Meninas_01.jpg" width="230" height="265" /><figcaption class="wp-caption-text">Las Meninas[1]</figcaption></figure>

In Lacan’s seminars, he discussed the artists Cézanne, Holbein and Velasquez. In each case the fil rouge which connected Lacan’s thought was the idea of shifts in perspective leading to ways in which the artist had produced a work that evoked the experience of the “gaze”. In Seminar XIII, in discussing Velasquez’ Las Meninas, Lacan identifies the “picture within the picture” which we see Velasquez working on, as the Vorstellungsrepräsentanz, the representative of the representation. Lacan very clearly distinguished representation as being on the side of signification, whereas the “representative of representation” is on the side of the signifier. In Las Meninas the “picture in the picture” is painted by Velasquez at the conjunction of two perspectives which are impossible in one space. Lacan said the “picture in the picture”, as the “representative of representation”, casts uncertainty on other “representations” in the painting. These other “objects” take on this disturbance of perspective in a domino effect, which allows many elements of the painting to take on this “representative of the representation” effect. This destabilizing of the visual space of the painting allows for displacements and condensations of images in the painting. An endless series of questions arise about the relations between the elements in the painting. People have talked about this painting for 350 years! What grounds the artist’s ability to do this is a masterful knowledge of his craft and an appreciation of a beyond of representation. With Las Meninas, it is Velasquez’ ability to construct an impossible melding of perspectives that keeps the viewer in suspense.

@@ -22,7 +22,7 @@ tags:

**An example of this in cinema is the ending of the movie 2001: A Space Odyssey, which was written and directed by Stanley Kubrick.**

-<span class="embed-youtube" style="text-align:center; display: block;"></span>
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/AXS8P0HksQo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

The trip through the wormhole takes our protagonist to a particularly ambiguous environment, adorned with luxurious furnishings but maintaining a clinical or rather detached, oddly misunderstood and superficial facsimile of luxury. Here Dave runs through his life in fast forward until he dies and is reborn in the form of the ‘Star Child’. The cuts we see here have Dave observing himself in the third person; then we switch over to the other Dave and follow him. This device is an ingenious way in which Kubrick elegantly sidesteps the use of the montage technique, simultaneously progressing time without resorting to fades, whilst furthering the artificiality of the environment with a deliberate manipulation of time.


+1 -1 content/blog/2017-02-14-survey-paper-on-security-in-wireless-sensor-networks.md

@@ -17,4 +17,4 @@ Wireless Sensor Network is an emerging area that shows great future prospects. T

[View Fullscreen][1]

-[1]: /wp-content/plugins/pdfjs-viewer-shortcode/pdfjs/web/viewer.php?file=http%3A%2F%2F13.232.63.7%2Fwp-content%2Fuploads%2F2017%2F07%2FTP_WSN2017_Group_15-1.pdf&download=true&print=true&openfile=false
+[1]: /wp-content/uploads/2017/07/TP_WSN2017_Group_15-1.pdf

+1 -1 content/blog/2017-04-20-retrofitting-led-lamps-into-smart-lamps.md

@@ -18,4 +18,4 @@ Objective of this project was to show as a proof of concept that we can pick up


-[1]: /wp-content/plugins/pdfjs-viewer-shortcode/pdfjs/web/viewer.php?file=http%3A%2F%2F13.232.63.7%2Fwp-content%2Fuploads%2F2017%2F07%2FWSN-Project-Report.pdf&download=true&print=true&openfile=false
+[1]: /wp-content/uploads/2017/07/WSN-Project-Report.pdf

+0 -15 content/blog/2017-05-20-.md

@@ -1,15 +0,0 @@
---
title: "2016"
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=192
categories:
- uncategorized
format: status

---
2017 has been a dull year. It has felt even more dull after the fast paced year that 2016 was.


+2 -1 content/blog/2017-07-27-216.md

@@ -1,8 +1,9 @@
---
title: Death of Ivan Illyich
author: rhnvrm
type: post
date: 2017-07-27T20:40:51+00:00
-url: blog/2017/07/27/216/
+url: blog/2017/07/27/death-of-ivn-illtich/
categories:
- uncategorized
tags:

+0 -12 content/blog/2017-10-03-.md

@@ -1,12 +0,0 @@
---
title: I’m moving back to Firefox
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=221
categories:
- uncategorized
format: status

---

+7 -4 content/blog/2017-10-16-was-codification-of-odissi-successful-in-capturing-the-true-essence-of-the-dance-as-it-was-prevalent-or-even-as-it-was-performed-in-the-ancient.md

@@ -10,10 +10,13 @@ tags:
- odissi

---
-Was codification of Odissi successful in capturing the true essence of the dance as it was prevalent or even as it was performed in the ancient era?
-[View Fullscreen][1]
-[1]: /wp-content/plugins/pdfjs-viewer-shortcode/pdfjs/web/viewer.php?file=http%3A%2F%2F13.232.63.7%2Fwp-content%2Fuploads%2F2017%2F11%2Fdoc.pdf&download=true&print=true&openfile=false

+The transfer of knowledge required for the continued existence of any performance art requires intense and deliberate training from both the Guru and the Shishya. Through codification and written text, the need to rely on this tradition to study the art form decreases, but the difficulty of mastering it increases due to standardization. In my study of the readings by Anita Cherian and the Odissi Renaissance, along with my understanding of linguistics and language theory, I wish to answer the question of how the codification of Odissi dance, a performance art, has resulted in the birth of a modern classical dance form, far from what was probably performed by the ancients. I will first give a background to the situation before independence for someone unfamiliar with the scenario. I will assert here that the true essence of the dance was lost and it was only after the revival of classical forms due to nationalistic planning that modern Odissi was born. I then look upon how the institutionalization of performance art in India was necessitated by the Sangeet Natak Akademi (SNA) and India’s cultural planning, the impact it had on the local art forms of the new nation state, and how it led to the codification of Odissi, and argue that it was indeed the policy framework and patronage that pushed the Gotipuas to, in essence, codify and revive classical Odissi according to the Natyashastra. With the emergence of this totally new dance form, differing from both the ancient and the actually practiced forms of the time and evolving to the present day, I discuss how the codification, through a lens of linguistics, demonstrates that, much like spoken languages, dynamic art performances are not truly captured by codified grammar.

+Indian Classical Dances such as Odissi are disseminated to students via oral tradition and usually adhere to no written syllabus, other than the actual toil, blood and sweat of the disciple with the Guru. The Natyashastra, the ancient document on dance, music and drama, mentions the Odramagadhi style of dance. Through this it is concluded that Odissi dance has existed within a classical framework for 2000 years. The evidence was compounded by the sculptures of dance poses found in temples and archeological sites. The art was suppressed by the Islamic rule and the British rule that followed. The maharis, the temple dancers who held the knowledge of the original form, stopped practicing the dance due to this suppression. The dance form was carried on by the Gotipuas, boys aged between 9 and 14 years dressed in drag, who continued the dance form in their own style. Hence, due to the lack of writing by the maharis there were not many written records about the dance in the recent era. Also, the original temple dance was lost and survived only in the archeological remnants and the Gotipuas. Here, we can see that since the original Gurus of these forms were lost there was no way to carry on the tradition, and hence the essence was indeed lost. However, with the independence of India, a new wave of cultural revival spawned, along with a passion for identity amongst the Aanchalis to assert their Odissi style of music and dance amongst the Indian diaspora.

+Institutionalization and standardization of the arts by the SNA was a huge influence and motivation behind the codification of Odissi. As mentioned by Anita Cherian, the theatre was indirectly controlled by the SNA, which was influenced by the Government and its idea of culture and cultural unification. The theatre is where the middle class went, and for any performer to showcase their art, it was clear that they would have to conform to the SNA’s ideals. Not only this, but the awards and scholarships were also directly controlled by the SNA. A clear example of this is in fact Odissi, which was not accepted to be classical enough until it was reformed with the Natyashastra and the Abinaya Darpana by Guru Mayadhar Raut, as mentioned in the Odissi Renaissance. The SNA had replaced the patronage of the royalty of India, and there was no way to not be in tandem with the SNA if you were an artiste in India. For example, it was mentioned in the lectures that there was no evidence for the usage of the now ubiquitously associated silver jewellery with Odissi in the ancient era. But, upon delving further, we find that one of the biggest patrons of Odissi was a silversmith. Later, the SNA took upon this role of being the nourisher of the arts. Dhirendranath Patnaik comments that the state of Odissi was poor, with poorly developed music, costumes and repertory. It is therefore clear to see the reasons why the Jayantika Association, composed of practicing gurus, dancers and scholars, got together to rebuild the repertory of the form. The combined form, composed of various practiced forms, was incorporated into the mutually agreed Jayantika Association codes, styles, and repertoire, influenced by the sculpturesque poses along with the mudras. Only after this, and a few highly appraised performances by the troupes, did the Central Sangeet Natak Akademi accept Odissi dance as a classical school of dance.

+The premise that the classical arts are not living arts has led to individual performance pieces or choreographies that do not follow a certain style being categorized as not Odissi enough. For example, the Ramli Ibrahim style of Odissi is often remarked to be so. People who study modern linguistics consider spoken language to be the true form of language. Spoken language is the primary language, while written language is an imperfect reflection of spoken language, conveyed through an imperfect technology, that is, writing. Spoken language comes naturally to all normal human children. Similar to language, dance is also an expression of the human mind. Normal human children naturally develop rhythm and can perform basic movements to rhythms at an early age. There have been numerous cases where spoken languages without a script have been codified, such as Korean. A case familiar to us Indians would be “Indian English”, which is not yet codified or even accepted to exist, although, according to linguistics, it is a real phenomenon. Here, the written language fails to capture the dynamic and changing language. Similar to grammar for languages, Odissi has been codified and is composed of motifs, movements and abhinaya. The codified structure serves as the grammar and helps the Gurus to make their choreographies. The sculpturesque poses, which are considered to be an essential quality in modern Odissi, were nowhere to be seen in the dances of the Gotipuas or the Maharis. The evidence for the dynamic nature of Odissi, or of any other dance form, is in the flourishing Odissi Paddhatis (or Gharanas). This system allows the disciples to work upon the work of their Guru and add to the style of their Gharana, while at the same time being limited to the style and formalities of the Gharana. We can see that four flavors of Odissi prevail in modern times. All of them vary slightly from what the Jayantika Association codified earlier, and in actuality Odissi exists as a dynamic art.

+As discussed above, I find that the codification of Odissi has served the purpose for which it was done. However, it is sufficient to look at the arguments presented above to assert that classical Odissi is far from how it was practiced in the ancient era, and moreover did not even capture how it was performed at the time it was being codified. The institutionalization of the arts in India played a major role, and its influence was enhanced by the fact that the state was the only major patron, providing the stage, the awards and the recognition for talent. Emphasis should be placed on the fact that performance art is clearly dynamic, not static, and is uniquely tied to the socio-political scenario at any given moment in time. Therefore, although the Odissi we know today is unlike what was practised years ago, we find that the codification has helped in the adequate preservation and revival of the form.

+1 -2 content/blog/2017-11-30-emotive-adsense-project.md

@@ -8,13 +8,12 @@ categories:
- projects

---
-<span class="embed-youtube" style="text-align:center; display: block;"></span>

## Objective

Use Facial Expressions to find segments of the video where engagement is above a threshold and display advertisements during those segments.

-&nbsp;
+<iframe width="100%" height="480" src="https://www.youtube.com/embed/RnUbnOvWobI" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
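A minimal sketch of the segment-finding step described in the objective (the scores and threshold here are hypothetical, not the project's actual code):

```python
# Per-second engagement scores from a facial-expression model (made up here).
scores = [0.2, 0.3, 0.8, 0.9, 0.85, 0.4, 0.1, 0.7, 0.75, 0.2]
THRESHOLD = 0.6

def high_engagement_segments(scores, threshold):
    # Return (start, end) index pairs where scores stay above the threshold.
    segments, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i
        elif s <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments

print(high_engagement_segments(scores, THRESHOLD))  # [(2, 5), (7, 9)]
```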

## Domain Background


+1 -0 content/blog/2017-12-19-what-thefuck-is-wrong-with.md

@@ -9,6 +9,7 @@ client-modified:
categories:
- uncategorized
format: aside
+draft: true

---
**What thefuck is wrong with my zsh?**

+4 -2 content/blog/2017-12-20-setting-up-latex-on-spacemacs.md

@@ -25,9 +25,11 @@ Then, all that needs to be done is press, `SPC-m-b` to build and `SPC-m-v` to vi

By default, though, Emacs will open it in your system's default PDF viewer. Emacs also provides another layer, `pdf-tools`, briefly mentioned above, which allows rendering PDF files inside Emacs itself. After adding this layer, you can add the following to your config file to set pdf-tools as the default PDF viewer inside Emacs.

```lisp
(setq TeX-view-program-selection '((output-pdf "PDF Tools"))
      TeX-view-program-list '(("PDF Tools" TeX-pdf-tools-sync-view))
      TeX-source-correlate-start-server t)
```

Similarly, we can also set up syncing between TeX and the PDF, which I will cover sometime later when the need arises.

+69 -47 content/blog/2017-12-21-deep-learning-through-the-lens-of-the-information-plane.md

@@ -37,34 +37,34 @@ A Markov process is a &#8220;memory-less&#8221; (also called “Markov Property

### **2.2 KL Divergence**

KL divergence measures how one probability distribution $p$ diverges from a second, expected probability distribution $q$. It is asymmetric. [5]

$$D_{KL}(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)} = - \sum_x p(x)\log q(x) + \sum_x p(x)\log p(x) = H(P, Q) - H(P)$$

$D_{KL}$ achieves its minimum of zero when $p(x) = q(x)$ everywhere.

### **2.3 Mutual Information**

Mutual information measures the mutual dependence between two variables. It quantifies the &#8220;amount of information&#8221; obtained about one random variable through the other random variable. Mutual information is symmetric. [5]

$$I(X;Y) = D_{KL}\left[\, p(x,y) \,\|\, p(x)p(y) \,\right] = \sum_{x \in X, y \in Y} p(x, y) \log\left(\frac{p(x, y)}{p(x)p(y)}\right) = \sum_{x \in X, y \in Y} p(x, y) \log\left(\frac{p(x|y)}{p(x)}\right) = H(X) - H(X|Y)$$
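To make these quantities concrete, here is a small numeric sketch (the joint distribution is made up for illustration) that computes $I(X;Y)$ as the KL divergence between the joint and the product of the marginals:

```python
import math

# A made-up joint distribution p(x, y) over binary X and Y.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_x = {0: 0.5, 1: 0.5}  # marginals of the table above
p_y = {0: 0.5, 1: 0.5}

# I(X;Y) = D_KL( p(x,y) || p(x)p(y) ), in bits.
mi = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())
print(mi)  # ~0.278 bits; it would be 0 if X and Y were independent
```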

### **2.4 Data Processing Inequality**

For any Markov chain $X \rightarrow Y \rightarrow Z$, we would have [5]

$$I(X; Y) \geq I(X; Z) \quad (1)$$

A deep neural network can be viewed as a Markov chain, and thus when we are moving down the layers of a DNN, the mutual information between the layer and the input can only decrease.
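As a toy check of inequality (1) (with made-up noise levels, not taken from the paper), pass a fair bit through two noisy binary symmetric steps and compare the mutual informations:

```python
import math

def h2(p):
    # Binary entropy in bits.
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# X -> Y -> Z, where each step flips the bit with probability 0.1 (hypothetical).
flip = 0.1
i_xy = 1 - h2(flip)  # I(X;Y) ~ 0.531 bits for a uniform input bit

flip_xz = flip * (1 - flip) + (1 - flip) * flip  # effective flip prob. X -> Z
i_xz = 1 - h2(flip_xz)  # I(X;Z) ~ 0.320 bits

print(i_xy >= i_xz)  # True, as the DPI requires
```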

### **2.5 Reparameterization Invariance**

For two invertible functions $\phi$, $\psi$, the mutual information still holds: <a name="RepInv"></a>

$$I(X; Y) = I(\phi(X); \psi(Y)) \quad (2)$$

@@ -73,30 +73,30 @@ For example, if we shuffle the weights in one layer of DNN, it would not affect

### **2.6 The Asymptotic Equipartition Property**

This theorem is a simple consequence of the weak law of large numbers. It states that if a set of values $X_1, X_2, ..., X_n$ is drawn independently from a random variable $X$ distributed according to $P(x)$, then the joint probability $P(X_1,...,X_n)$ satisfies [5]

$$\frac{-1}{n} \log_{2}{P(X_1,X_2,...,X_n)} \rightarrow H(X) \quad (3)$$

where $H(X)$ is the entropy of the random variable $X$.
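A quick empirical sketch of (3) for an arbitrary Bernoulli(0.3) source (my example, not from the paper):

```python
import math
import random

random.seed(0)
p, n = 0.3, 100_000
xs = [1 if random.random() < p else 0 for _ in range(n)]

# For an i.i.d. sample, -(1/n) log2 P(x_1,...,x_n) is the average surprisal.
avg_surprisal = -sum(math.log2(p if x else 1 - p) for x in xs) / n
entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
print(avg_surprisal, entropy)  # both ~0.881 bits
```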

Although this is outside the scope of this work, for the sake of completeness I would like to mention how the authors of [2] use this to argue that for a typical hypothesis class the size of $X$ is approximately $2^{H(X)}$. Considering an $\epsilon$-partition, $T_\epsilon$, on $X$, the cardinality of the hypothesis class, $|H_\epsilon|$, can be written as $|H_\epsilon| \sim 2^{|X|} \rightarrow 2^{|T_\epsilon|}$ and therefore we have,

$$\vert T_\epsilon \vert \sim \frac{2^{H(X)}}{2^{H(X \vert T_\epsilon)}} = 2^{I(T_\epsilon; X)} \quad (4)$$

Then the input compression bound,

$$\epsilon^2 < \frac{\log|H_\epsilon| + \log{1/\delta}}{2m} \quad (5)$$

becomes,

$$\epsilon^2 < \frac{2^{I(T_\epsilon; X)} + \log{1/\delta}}{2m} \quad (6)$$

The authors then further develop this to provide a general bound on learning by combining it with the Information Bottleneck theory [6].
@@ -105,33 +105,46 @@ The authors then further develop this to provide a general bound on learning by

### **3.1 DNN Layers as Markov Chain**

In supervised learning, the training data contains sampled observations from the joint distribution of <img src="//s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />and <img src="//s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />. The input variable <img src="//s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />and weights of hidden layers are all high-dimensional random variable. The ground truth target <img src="//s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />and the predicted value <img src="//s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />are random variables of smaller dimensions in the classification settings. Moreover, we want to efficiently learn such representations from an empirical sample of the (unknown) joint distribution <img src="//s0.wp.com/latex.php?latex=%7BP%28X%2CY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{P(X,Y)}" title="{P(X,Y)}" class="latex" />, in a way that provides good generalization.
In supervised learning, the training data contains sampled observations from the joint distribution of <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />. The input variable <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> and the weights of the hidden layers are all high-dimensional random variables. The ground-truth target <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" /> and the predicted value <img src="https://s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" /> are random variables of smaller dimension in the classification setting. Moreover, we want to efficiently learn such representations from an empirical sample of the (unknown) joint distribution <img src="https://s0.wp.com/latex.php?latex=%7BP%28X%2CY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{P(X,Y)}" title="{P(X,Y)}" class="latex" />, in a way that provides good generalization.

<figure id="attachment_302" style="width: 700px" class="wp-caption aligncenter"><img class="wp-image-302 size-large" src="/wp-content/uploads/2017/12/fig1-700x442.png" alt="" width="700" height="442" srcset="/wp-content/uploads/2017/12/fig1-700x442.png 700w, /wp-content/uploads/2017/12/fig1-300x189.png 300w, /wp-content/uploads/2017/12/fig1-768x485.png 768w, /wp-content/uploads/2017/12/fig1.png 1382w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-caption-text">The structure of a deep neural network, which consists of the target label <img src="//s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />, input layer <img src="//s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />, hidden layers <img src="//s0.wp.com/latex.php?latex=%7Bh_1%2C%5Cdots%2Ch_m%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{h_1,&#92;dots,h_m}" title="{h_1,&#92;dots,h_m}" class="latex" />and the final prediction <img src="//s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />. (Image Source: Tishby 2015)[3]</figcaption></figure>If we label the hidden layers of a DNN as <img src="//s0.wp.com/latex.php?latex=%7Bh_1%2Ch_2%2C...%2Ch_m%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{h_1,h_2,...,h_m}" title="{h_1,h_2,...,h_m}" class="latex" />as in Figure above, we can view each layer as one state of a Markov Chain: <img src="//s0.wp.com/latex.php?latex=%7Bh_i+%5Crightarrow+h_%7Bi%2B1%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{h_i &#92;rightarrow h_{i+1}}" title="{h_i &#92;rightarrow h_{i+1}}" class="latex" />. According to DPI, we would have:
<figure id="attachment_302" style="width: 700px" class="wp-caption aligncenter"><img class="wp-image-302 size-large" src="/wp-content/uploads/2017/12/fig1-700x442.png" alt="" width="700" height="442" srcset="/wp-content/uploads/2017/12/fig1-700x442.png 700w, /wp-content/uploads/2017/12/fig1-300x189.png 300w, /wp-content/uploads/2017/12/fig1-768x485.png 768w, /wp-content/uploads/2017/12/fig1.png 1382w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-caption-text">The structure of a deep neural network, which consists of the target label <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />, input layer <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />, hidden layers <img src="https://s0.wp.com/latex.php?latex=%7Bh_1%2C%5Cdots%2Ch_m%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{h_1,&#92;dots,h_m}" title="{h_1,&#92;dots,h_m}" class="latex" />and the final prediction <img src="https://s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />. (Image Source: Tishby 2015)[3]</figcaption></figure>

<img src="//s0.wp.com/latex.php?latex=H%28X%29+%5Cgeq+I%28X%3B+h_1%29+%5Cgeq+I%28X%3B+h_2%29+%5Cgeq+...+%5Cgeq+I%28X%3B+h_m%29+%5Cgeq+I%28X%3B+%5Chat%7BY%7D%29++I%28X%3B+Y%29+%5Cgeq+I%28h_1%3B+Y%29+%5Cgeq+I%28h_2%3B+Y%29+%5Cgeq+...+%5Cgeq+I%28h_m%3B+Y%29+%5Cgeq+I%28%5Chat%7BY%7D%3B+Y%29++&#038;bg=ffffff&#038;fg=000&#038;s=0" alt="H(X) &#92;geq I(X; h_1) &#92;geq I(X; h_2) &#92;geq ... &#92;geq I(X; h_m) &#92;geq I(X; &#92;hat{Y}) I(X; Y) &#92;geq I(h_1; Y) &#92;geq I(h_2; Y) &#92;geq ... &#92;geq I(h_m; Y) &#92;geq I(&#92;hat{Y}; Y) " title="H(X) &#92;geq I(X; h_1) &#92;geq I(X; h_2) &#92;geq ... &#92;geq I(X; h_m) &#92;geq I(X; &#92;hat{Y}) I(X; Y) &#92;geq I(h_1; Y) &#92;geq I(h_2; Y) &#92;geq ... &#92;geq I(h_m; Y) &#92;geq I(&#92;hat{Y}; Y) " class="latex" />

If we label the hidden layers of a DNN as <img src="https://s0.wp.com/latex.php?latex=%7Bh_1%2Ch_2%2C...%2Ch_m%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{h_1,h_2,...,h_m}" title="{h_1,h_2,...,h_m}" class="latex" /> as in the figure above, we can view each layer as one state of a Markov chain: <img src="https://s0.wp.com/latex.php?latex=%7Bh_i+%5Crightarrow+h_%7Bi%2B1%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{h_i &#92;rightarrow h_{i+1}}" title="{h_i &#92;rightarrow h_{i+1}}" class="latex" />.

<figure id="attachment_303" style="width: 700px" class="wp-caption aligncenter"><img class="wp-image-303 size-large" src="/wp-content/uploads/2017/12/fig2-700x503.png" alt="" width="700" height="503" srcset="/wp-content/uploads/2017/12/fig2-700x503.png 700w, /wp-content/uploads/2017/12/fig2-300x216.png 300w, /wp-content/uploads/2017/12/fig2-768x552.png 768w, /wp-content/uploads/2017/12/fig2.png 869w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-caption-text">The DNN layers form a Markov chain of successive internal representations of the input layer <img src="//s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />. (Image Source: Schwartz-Ziv and Tishby 2017 [2])</figcaption></figure>As long as these transformations on <img src="//s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />in <img src="//s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />about <img src="//s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />preserve information, we don’t really care which individual neurons within the layers encode which features of the input. This can be captured by finding the mutual information of <img src="//s0.wp.com/latex.php?latex=%7BT%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{T}" title="{T}" class="latex" />with respect to <img src="//s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />and <img src="//s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />. Schwartz-Ziv and Tishby (2017) treat the whole layer, <img src="//s0.wp.com/latex.php?latex=%7BT%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{T}" title="{T}" class="latex" />, as a single random variable, charachterized by <img src="//s0.wp.com/latex.php?latex=%7BP%28T%7CX%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{P(T|X)}" title="{P(T|X)}" class="latex" />and <img src="//s0.wp.com/latex.php?latex=%7BP%28Y%7CT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{P(Y|T)}" title="{P(Y|T)}" class="latex" />, the encoder and decoder distributions respectively, and use the Reparameterization Invariance given in [(2)][1] to argue that since layers related by invertible re-parameterization appear in the same point, each information path in the plane corresponds to many different DNN’s, with possibly very different architectures. [3]
According to the Data Processing Inequality (DPI), we would have:

<img src="//s0.wp.com/latex.php?latex=I%28X%3B+Y%29+%5Cgeq+I%28T_1%3B+Y%29+%5Cgeq+I%28T_2%3B+Y%29+%5Cgeq+...+%5Cgeq+I%28T_k%3B+Y%29+%5Cgeq+I%28%5Chat%7BY%7D%3B+Y%29++H%28X%29+%5Cgeq+I%28X%3B+T_1%29+%5Cgeq+I%28X%3B+T_2%29+%5Cgeq+...+%5Cgeq+I%28X%3B+T_k%29+%5Cgeq+I%28X%3B+%5Chat%7BY%7D%29++&#038;bg=ffffff&#038;fg=000&#038;s=0" alt="I(X; Y) &#92;geq I(T_1; Y) &#92;geq I(T_2; Y) &#92;geq ... &#92;geq I(T_k; Y) &#92;geq I(&#92;hat{Y}; Y) H(X) &#92;geq I(X; T_1) &#92;geq I(X; T_2) &#92;geq ... &#92;geq I(X; T_k) &#92;geq I(X; &#92;hat{Y}) " title="I(X; Y) &#92;geq I(T_1; Y) &#92;geq I(T_2; Y) &#92;geq ... &#92;geq I(T_k; Y) &#92;geq I(&#92;hat{Y}; Y) H(X) &#92;geq I(X; T_1) &#92;geq I(X; T_2) &#92;geq ... &#92;geq I(X; T_k) &#92;geq I(X; &#92;hat{Y}) " class="latex" />
<img src="https://s0.wp.com/latex.php?latex=H%28X%29+%5Cgeq+I%28X%3B+h_1%29+%5Cgeq+I%28X%3B+h_2%29+%5Cgeq+...+%5Cgeq+I%28X%3B+h_m%29+%5Cgeq+I%28X%3B+%5Chat%7BY%7D%29++I%28X%3B+Y%29+%5Cgeq+I%28h_1%3B+Y%29+%5Cgeq+I%28h_2%3B+Y%29+%5Cgeq+...+%5Cgeq+I%28h_m%3B+Y%29+%5Cgeq+I%28%5Chat%7BY%7D%3B+Y%29++&#038;bg=ffffff&#038;fg=000&#038;s=0" alt="H(X) &#92;geq I(X; h_1) &#92;geq I(X; h_2) &#92;geq ... &#92;geq I(X; h_m) &#92;geq I(X; &#92;hat{Y}) I(X; Y) &#92;geq I(h_1; Y) &#92;geq I(h_2; Y) &#92;geq ... &#92;geq I(h_m; Y) &#92;geq I(&#92;hat{Y}; Y) " title="H(X) &#92;geq I(X; h_1) &#92;geq I(X; h_2) &#92;geq ... &#92;geq I(X; h_m) &#92;geq I(X; &#92;hat{Y}) I(X; Y) &#92;geq I(h_1; Y) &#92;geq I(h_2; Y) &#92;geq ... &#92;geq I(h_m; Y) &#92;geq I(&#92;hat{Y}; Y) " class="latex" />

A DNN is designed to learn how to describe <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> in order to predict <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" /> and, eventually, to compress <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> so that it holds only the information related to <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />. Tishby describes this process as &#8220;successive refinement of relevant information&#8221; [3].

<figure id="attachment_303" style="width: 700px" class="wp-caption aligncenter"><img class="wp-image-303 size-large" src="/wp-content/uploads/2017/12/fig2-700x503.png" alt="" width="700" height="503" srcset="/wp-content/uploads/2017/12/fig2-700x503.png 700w, /wp-content/uploads/2017/12/fig2-300x216.png 300w, /wp-content/uploads/2017/12/fig2-768x552.png 768w, /wp-content/uploads/2017/12/fig2.png 869w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-caption-text">The DNN layers form a Markov chain of successive internal representations of the input layer <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />. (Image Source: Schwartz-Ziv and Tishby 2017 [2])</figcaption></figure>


As long as these transformations on <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> preserve the information about <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />, we don’t really care which individual neurons within the layers encode which features of the input. This can be captured by finding the mutual information of <img src="https://s0.wp.com/latex.php?latex=%7BT%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{T}" title="{T}" class="latex" /> with respect to <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />. Shwartz-Ziv and Tishby (2017) treat the whole layer, <img src="https://s0.wp.com/latex.php?latex=%7BT%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{T}" title="{T}" class="latex" />, as a single random variable, characterized by <img src="https://s0.wp.com/latex.php?latex=%7BP%28T%7CX%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{P(T|X)}" title="{P(T|X)}" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=%7BP%28Y%7CT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{P(Y|T)}" title="{P(Y|T)}" class="latex" />, the encoder and decoder distributions respectively, and use the Reparameterization Invariance given in [(2)][1] to argue that, since layers related by an invertible re-parameterization appear at the same point, each information path in the plane corresponds to many different DNNs, with possibly very different architectures. [3]

<img src="https://s0.wp.com/latex.php?latex=I%28X%3B+Y%29+%5Cgeq+I%28T_1%3B+Y%29+%5Cgeq+I%28T_2%3B+Y%29+%5Cgeq+...+%5Cgeq+I%28T_k%3B+Y%29+%5Cgeq+I%28%5Chat%7BY%7D%3B+Y%29++H%28X%29+%5Cgeq+I%28X%3B+T_1%29+%5Cgeq+I%28X%3B+T_2%29+%5Cgeq+...+%5Cgeq+I%28X%3B+T_k%29+%5Cgeq+I%28X%3B+%5Chat%7BY%7D%29++&#038;bg=ffffff&#038;fg=000&#038;s=0" alt="I(X; Y) &#92;geq I(T_1; Y) &#92;geq I(T_2; Y) &#92;geq ... &#92;geq I(T_k; Y) &#92;geq I(&#92;hat{Y}; Y) H(X) &#92;geq I(X; T_1) &#92;geq I(X; T_2) &#92;geq ... &#92;geq I(X; T_k) &#92;geq I(X; &#92;hat{Y}) " title="I(X; Y) &#92;geq I(T_1; Y) &#92;geq I(T_2; Y) &#92;geq ... &#92;geq I(T_k; Y) &#92;geq I(&#92;hat{Y}; Y) H(X) &#92;geq I(X; T_1) &#92;geq I(X; T_2) &#92;geq ... &#92;geq I(X; T_k) &#92;geq I(X; &#92;hat{Y}) " class="latex" />

This is to say that, after training, a new input passes through the trained network’s layers, which form a Markov chain, to the predicted output <img src="https://s0.wp.com/latex.php?latex=%7B%5Chat%7BY%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{&#92;hat{Y}}" title="{&#92;hat{Y}}" class="latex" />. The information plane is discussed further in Section [3][2].
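In symbols (my restatement of the chain, not notation taken verbatim from the papers):

$$Y \rightarrow X \rightarrow T_1 \rightarrow T_2 \rightarrow \dots \rightarrow T_k \rightarrow \hat{Y}$$

so each representation depends on the input only through its predecessor, $p(t_i \mid x, t_1, \dots, t_{i-1}) = p(t_i \mid t_{i-1})$, which is exactly the condition the DPI inequalities above require.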

### **3.2 The Information Plane**

<a name="ssecIP"></a>

Using the representation in Fig. [3][3] of the encoder and decoder distributions: the encoder can be seen as a representation of <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" />, while the decoder translates the information in the current layer to the target output <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />.

The information can be interpreted and visualized as a plot between the encoder mutual information <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT_%7Bi%7D%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T_{i})}" title="{I(X;T_{i})}" class="latex" /> and the decoder mutual information <img src="https://s0.wp.com/latex.php?latex=%7BI%28T_%7Bi%7D%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T_{i};Y)}" title="{I(T_{i};Y)}" class="latex" />.

<figure id="attachment_304" style="width: 700px" class="wp-caption aligncenter"><img class="wp-image-304 size-large" src="/wp-content/uploads/2017/12/fig3-700x292.png" alt="" width="700" height="292" srcset="/wp-content/uploads/2017/12/fig3-700x292.png 700w, /wp-content/uploads/2017/12/fig3-300x125.png 300w, /wp-content/uploads/2017/12/fig3-768x320.png 768w, /wp-content/uploads/2017/12/fig3.png 1390w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-caption-text">The encoder vs decoder mutual information of DNN hidden layers of 50 experiments. Different layers are color-coded, with green being the layer right next to the input and the orange being the furthest. There are three snapshots, at the initial epoch, 400 epochs and 9000 epochs respectively. (Image source: Shwartz-Ziv and Tishby, 2017) [2])</figcaption></figure>

<figure id="attachment_304" style="width: 700px" class="wp-caption aligncenter"><img class="wp-image-304 size-large" src="/wp-content/uploads/2017/12/fig3-700x292.png" alt="" width="700" height="292" srcset="/wp-content/uploads/2017/12/fig3-700x292.png 700w, /wp-content/uploads/2017/12/fig3-300x125.png 300w, /wp-content/uploads/2017/12/fig3-768x320.png 768w, /wp-content/uploads/2017/12/fig3.png 1390w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-caption-text">The encoder vs decoder mutual information of DNN hidden layers of 50 experiments. Different layers are color-coded, with green being the layer right next to the input and the orange being the furthest. There are three snapshots, at the initial epoch, 400 epochs and 9000 epochs respectively. (Image source: Shwartz-Ziv and Tishby, 2017) [2])</figcaption></figure>Each dot in Fig. [3][4]. marks the encoder/ decoder mutual information of one hidden layer of one network simulation (no regularization is applied; no weights decay, no dropout, etc.). They move up as expected because the knowledge about the true labels is increasing (accuracy increases). At the early stage, the hidden layers learn a lot about the input X, but later they start to compress to forget some information about the input. Tishby believes that &#8220;the most important part of learning is actually forgetting&#8221;. [7]

Each dot in Fig. [3][4] marks the encoder/decoder mutual information of one hidden layer of one network simulation (no regularization is applied: no weight decay, no dropout, etc.). The dots move up as expected because the knowledge about the true labels is increasing (accuracy increases). At the early stage, the hidden layers learn a lot about the input X, but later they start to compress and forget some information about the input. Tishby believes that &#8220;the most important part of learning is actually forgetting&#8221;. [7]

Early on the points shoot up and to the right, as the hidden layers learn to retain more mutual information both with the input and also as needed to predict the output. But after a while, a phase shift occurs, and points move more slowly up and to the left.

<figure id="attachment_305" style="width: 761px" class="wp-caption aligncenter"><img class="wp-image-305 size-full" src="/wp-content/uploads/2017/12/fig4.png" alt="" width="761" height="289" srcset="/wp-content/uploads/2017/12/fig4.png 761w, /wp-content/uploads/2017/12/fig4-300x114.png 300w, /wp-content/uploads/2017/12/fig4-700x266.png 700w" sizes="(max-width: 761px) 100vw, 761px" /><figcaption class="wp-caption-text">The evolution of the layers with the training epochs in the information plane, for different training samples. On the left &#8211; 5% of the data, middle &#8211; 45% of the data, and right &#8211; 85% of the data. The colors indicate the number of training epochs with Stochastic Gradient Descent. (Image source: Shwartz-Ziv and Tishby, 2017) [2])</figcaption></figure>Schwartz-Ziv and Tishby name these two phases Empirical eRror Minimization (ERM) and the phase that follows as the Representation Compression Phase. Here the gradient means are much larger than their standard deviations, indicating small gradient stochasticity (high SNR). The increase in <img src="//s0.wp.com/latex.php?latex=%7BI_Y%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I_Y}" title="{I_Y}" class="latex" />is what we expect to see from cross-entropy loss minimization. The second diffusion phase minimizes the mutual information <img src="//s0.wp.com/latex.php?latex=%7BI%28X%3BT_i%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T_i)}" title="{I(X;T_i)}" class="latex" />– in other words, we’re discarding information in X that is irrelevant to the task at hand.
<figure id="attachment_305" style="width: 761px" class="wp-caption aligncenter"><img class="wp-image-305 size-full" src="/wp-content/uploads/2017/12/fig4.png" alt="" width="761" height="289" srcset="/wp-content/uploads/2017/12/fig4.png 761w, /wp-content/uploads/2017/12/fig4-300x114.png 300w, /wp-content/uploads/2017/12/fig4-700x266.png 700w" sizes="(max-width: 761px) 100vw, 761px" /><figcaption class="wp-caption-text">The evolution of the layers with the training epochs in the information plane, for different training samples. On the left &#8211; 5% of the data, middle &#8211; 45% of the data, and right &#8211; 85% of the data. The colors indicate the number of training epochs with Stochastic Gradient Descent. (Image source: Shwartz-Ziv and Tishby, 2017) [2])</figcaption></figure>

Shwartz-Ziv and Tishby name these two phases the Empirical Error Minimization (ERM) phase and the Representation Compression phase that follows. In the ERM phase the gradient means are much larger than their standard deviations, indicating small gradient stochasticity (high SNR). The increase in <img src="https://s0.wp.com/latex.php?latex=%7BI_Y%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I_Y}" title="{I_Y}" class="latex" /> is what we expect to see from cross-entropy loss minimization. The second, diffusion phase minimizes the mutual information <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT_i%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T_i)}" title="{I(X;T_i)}" class="latex" /> &#8211; in other words, we’re discarding information in X that is irrelevant to the task at hand.
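To make "high SNR" concrete: in my own notation (not the papers'), with L the training loss and W_l the weights of layer l, the per-layer gradient signal-to-noise ratio over a mini-batch is

$$\mathrm{SNR}_\ell = \frac{\lVert \operatorname{mean}(\nabla_{W_\ell} L) \rVert}{\lVert \operatorname{std}(\nabla_{W_\ell} L) \rVert}$$

The drift (ERM) phase is where this ratio is large, and the transition to the diffusion (compression) phase is where it drops below one and the noise starts to dominate the updates.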

A consequence of this, pointed out by Shwartz-Ziv and Tishby, is that there is a huge number of different networks with essentially optimal performance, and attempts to interpret single weights or even single neurons in such networks can be meaningless due to the randomised nature of the final weights of the DNN. [2]

@@ -145,54 +158,63 @@ Variations were made to the activation function to Rectified Linear Unit (ReLu)

### **4.2. Results**

<figure id="attachment_310" style="width: 640px" class="wp-caption aligncenter">

<img class="wp-image-310 size-full" src="/wp-content/uploads/2017/12/tanh-1.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/tanh-1.png 640w, /wp-content/uploads/2017/12/tanh-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Loss Function observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with tanh as activation function. The X-Axis represents training losses and the Y-Axis represents steps</figcaption></figure> <figure id="attachment_316" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-316 size-full" src="/wp-content/uploads/2017/12/tanh-2.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/tanh-2.png 640w, /wp-content/uploads/2017/12/tanh-2-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Information Plane observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with tanh as activation function. The X-Axis represents <img src="//s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="//s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure>
<img class="wp-image-310 size-full" src="/wp-content/uploads/2017/12/tanh-1.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/tanh-1.png 640w, /wp-content/uploads/2017/12/tanh-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Loss Function observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with tanh as activation function. The X-Axis represents training losses and the Y-Axis represents steps</figcaption></figure> <figure id="attachment_316" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-316 size-full" src="/wp-content/uploads/2017/12/tanh-2.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/tanh-2.png 640w, /wp-content/uploads/2017/12/tanh-2-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Information Plane observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with tanh as activation function. The X-Axis represents <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="https://s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure>

The results were plotted using the experimental setup and tanh as the activation function. It is important to note that it’s the lowest layer which appears in the top-right of this plot (maintains the most mutual information), and the top-most layer which appears in the bottom-left (has retained almost no mutual information before any training). So the information path being followed goes from the top-right corner to the bottom-left traveling down the slope.

Early on the points shoot up and to the right, as the hidden layers learn to retain more mutual information both with the input and also as needed to predict the output. But after a while, a phase shift occurs, and points move more slowly up and to the left.

<figure id="attachment_306" style="width: 640px" class="wp-caption aligncenter">

<img class="wp-image-306 size-full" src="/wp-content/uploads/2017/12/relu-1.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/relu-1.png 640w, /wp-content/uploads/2017/12/relu-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Loss Function observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with ReLu as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. The X-Axis represents for the figure on the right <img src="//s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="//s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure> <figure id="attachment_307" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-307 size-full" src="/wp-content/uploads/2017/12/relu-2.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/relu-2.png 640w, /wp-content/uploads/2017/12/relu-2-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Information Plane observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with ReLu as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. The X-Axis represents for the figure on the right <img src="//s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="//s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure> <figure id="attachment_309" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-309 size-full" src="/wp-content/uploads/2017/12/sigmoid-2.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/sigmoid-2.png 640w, /wp-content/uploads/2017/12/sigmoid-2-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Information Plane observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with Sigmoid as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. The X-Axis represents for the figure on the right <img src="//s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="//s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure> <figure id="attachment_308" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-308 size-full" src="/wp-content/uploads/2017/12/sigmoid-1.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/sigmoid-1.png 640w, /wp-content/uploads/2017/12/sigmoid-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Loss Function observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with Sigmoid as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. 
The X-Axis represents for the figure on the right <img src="//s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="//s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure>
<img class="wp-image-306 size-full" src="/wp-content/uploads/2017/12/relu-1.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/relu-1.png 640w, /wp-content/uploads/2017/12/relu-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Loss Function observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with ReLu as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. The X-Axis represents for the figure on the right <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="https://s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure> <figure id="attachment_307" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-307 size-full" src="/wp-content/uploads/2017/12/relu-2.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/relu-2.png 640w, /wp-content/uploads/2017/12/relu-2-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Information Plane observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with ReLu as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. The X-Axis represents for the figure on the right <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="https://s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure> <figure id="attachment_309" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-309 size-full" src="/wp-content/uploads/2017/12/sigmoid-2.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/sigmoid-2.png 640w, /wp-content/uploads/2017/12/sigmoid-2-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Information Plane observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with Sigmoid as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. The X-Axis represents for the figure on the right <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="https://s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure> <figure id="attachment_308" style="width: 640px" class="wp-caption aligncenter"><img class="wp-image-308 size-full" src="/wp-content/uploads/2017/12/sigmoid-1.png" alt="" width="640" height="480" srcset="/wp-content/uploads/2017/12/sigmoid-1.png 640w, /wp-content/uploads/2017/12/sigmoid-1-300x225.png 300w" sizes="(max-width: 640px) 100vw, 640px" /><figcaption class="wp-caption-text">Loss Function observed with a network having layers of 12-10-7-5-4-3-2 widths when trained with Sigmoid as activation function. The X-Axis on the left represents training losses and the Y-Axis represents steps. 
The X-Axis represents for the figure on the right <img src="https://s0.wp.com/latex.php?latex=%7BI%28X%3BT%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(X;T)}" title="{I(X;T)}" class="latex" />and the Y-Axis represents <img src="https://s0.wp.com/latex.php?latex=%7BI%28T%3BY%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{I(T;Y)}" title="{I(T;Y)}" class="latex" /></figcaption></figure>

### **4.3. Analysis**

The results of using the hyperbolic tangent (tanh) as the activation function correspond with the results obtained by Shwartz-Ziv and Tishby (2017) [2]. However, the same can&#8217;t be said about the results obtained when the ReLU or Sigmoid function was used as the activation function. The network seems to stabilize much faster when trained with ReLU, but it does not show any of the characteristics mentioned by Shwartz-Ziv and Tishby (2017), such as compression and diffusion in the information plane. This is in line with [4], although the authors have commented in the open review [4] that they have used other strategies for binning during MI calculation which give correct results. The compression and diffusion phases can be clearly seen in Fig. [4][5]. The corresponding plot of the loss function also shows that the DNN actually learned the input variable <img src="https://s0.wp.com/latex.php?latex=%7BX%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{X}" title="{X}" class="latex" /> with respect to the ground truth <img src="https://s0.wp.com/latex.php?latex=%7BY%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0" alt="{Y}" title="{Y}" class="latex" />.
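For concreteness, here is a minimal sketch of the kind of plug-in (binning) estimator behind the information-plane coordinates. Everything in it is an illustrative assumption rather than the exact code used for the plots: activations are taken to lie in [-1, 1] (natural for tanh), the bin count of 30 matches the one commonly reported for [2], and the helper names are made up.

```go
package main

import (
	"fmt"
	"math"
)

// binIndex discretizes an activation assumed to lie in [-1, 1]
// (natural for tanh) into one of nBins equal-width bins.
func binIndex(v float64, nBins int) int {
	i := int((v + 1) / 2 * float64(nBins))
	if i < 0 {
		i = 0
	}
	if i >= nBins {
		i = nBins - 1
	}
	return i
}

// layerKey collapses a layer's binned activation vector into one
// discrete symbol, so the whole layer T is a single random variable.
func layerKey(acts []float64, nBins int) string {
	s := ""
	for _, v := range acts {
		s += fmt.Sprintf("%d,", binIndex(v, nBins))
	}
	return s
}

// mi is the plug-in estimate of I(A;B) in bits from paired samples.
func mi(a, b []string) float64 {
	n := float64(len(a))
	joint := map[[2]string]float64{}
	pa := map[string]float64{}
	pb := map[string]float64{}
	for i := range a {
		joint[[2]string{a[i], b[i]}]++
		pa[a[i]]++
		pb[b[i]]++
	}
	total := 0.0
	for k, c := range joint {
		pxy := c / n
		total += pxy * math.Log2(pxy/((pa[k[0]]/n)*(pb[k[1]]/n)))
	}
	return total
}

func main() {
	// Toy data: two distinct inputs, each seen twice, with a
	// 2-neuron layer activation recorded per sample.
	inputs := []string{"x0", "x1", "x0", "x1"}
	layer := [][]float64{{0.9, -0.8}, {-0.7, 0.6}, {0.9, -0.8}, {-0.7, 0.6}}
	t := make([]string, len(layer))
	for i, acts := range layer {
		t[i] = layerKey(acts, 30)
	}
	fmt.Printf("I(X;T) = %.2f bits\n", mi(inputs, t)) // prints 1.00
}
```

On real activations this same estimator is why the binning debate in [4] matters: unbounded ReLU outputs make the bin edges a free parameter, while tanh's bounded range makes them natural.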

## References

<a name="1">1. </a>Y. LeCun, Y. Bengio, and G. E. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015. [Online]. Available: <a href="http://sci-hub.tw/10.1038/nature14539" target="_blank" rel="noopener noreferrer">http://sci-hub.tw/10.1038/nature14539</a>
<a name="2">2. </a>R. Shwartz-Ziv and N. Tishby, “Opening the black box of deep neural networks via information,” CoRR, vol. abs/1703.00810, 2017. [Online]. Available: http://arxiv.org/abs/1703.00810
<a name="3">3. </a>N. Tishby and N. Zaslavsky, “Deep learning and the information bottleneck principle,” CoRR, vol. abs/1503.02406, 2015. [Online]. Available: http://arxiv.org/abs/1503.02406
<a name="4">4. </a>Anonymous, “On the information bottleneck theory of deep learning,” International Conference on Learning Representations, 2018. [Online]. Available: https://openreview.net/forum?id=ry WPG-A-
<a name="5">5. </a>T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, 2006.
<a name="6">6. </a>N. Tishby, F. C. N. Pereira, and W. Bialek, “The information bottleneck method,” CoRR, vol. physics/0004057, 2000. [Online]. Available: http://arxiv.org/abs/physics/0004057
<a name="7">7. </a>L.Weng. Anatomize deep learning with informa-tion theory. [Online]. Available: https://lilianweng.github.io/lillog/2017/09/28/anatomize-deep-learning-with-information-theory.html
<a name="8">8. </a>M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. [Online]. Available: https://www.tensorflow.org/
<a name="9">9. </a>E. Jones, T. Oliphant, P. Peterson et al., “SciPy: Open source scientific tools for Python,” 2001–, [Online; accessed ¡today¿]. [Online]. Available: http://www.scipy.org/
<a name="10">10. </a>S. Prabh. Prof. shashi prabh homepage. [Online]. Available: https://sites.google.com/a/snu.edu.in/shashi-prabh/home
<a name="11">11. </a>N. Wolchover. New theory cracks open the black box of deep learning — quanta magazine. Quanta Magazine. [On-line]. Available: https://www.quantamagazine.org/new-theory-cracks-
open-the-black-box-of-deep-learning-20170921/
<a name="12">12. </a>Machine learning subreddit. [Online]. Available: https://www.reddit.com/r/MachineLearning/

<span style="font-size: 8pt;"><em>This work has been undertaken in the Course Project component for the elective titled &#8220;Information Theory (Fall 2017)&#8221; [https://sites.google.com/a/snu.edu.in/shashi-prabh/teaching/information-theory-2017] at Shiv Nadar University under the guidance of Prof. Shashi Prabh</em></span>
<span style="font-size: 8pt;">This work has been undertaken in the Course Project component for the elective titled &#8220;Information Theory (Fall 2017)&#8221; [https://sites.google.com/a/snu.edu.in/shashi-prabh/teaching/information-theory-2017] at Shiv Nadar University under the guidance of Prof. Shashi Prabh</span>


[1]: #1
[2]: #2
[3]: #3
[4]: #4
[5]: #5
[6]: #6
[7]: #7
[8]: #8
[9]: #9
[10]: #10
[11]: #11
[12]: #12


+ 3
- 2
content/blog/2017-12-21-setting-up-python-on-spacemacs-and-using-pyenv-to-use-python3.md

@@ -20,9 +20,10 @@ Getting started with the setup along with the basic packages etc. was easy by ju

The problem I mentioned above about the Python versions became apparent when I ran a simple print function, which gave an error as I did not have any shebang at the top of the file. This made me realize a potential problem for the future, as Python development depends heavily upon virtual environments. Thankfully, the python layer had already added pyvenv and pyenv. However, pyenv only listed one `system` version, and that too was Python 2. So to solve this, I ran the following:

<pre class="brush: plain; title: ; notranslate" title="">pyenv virtualenv -p /usr/bin/python2 venv2
```bash
pyenv virtualenv -p /usr/bin/python2 venv2
pyenv virtualenv -p /usr/bin/python3 venv3
```



+ 4
- 77
content/blog/2018-02-23-featured-on-googles-instagram-instagram.md
Diff suppressed because the file is too large


+ 1
- 1
content/blog/2018-03-18-extract-filenames-without-their-extensions.md

@@ -16,4 +16,4 @@ format: aside
---
Extract filenames without their extensions and put them in the clipboard

<pre class="brush: plain; title: ; notranslate" title="">ls -C | awk -F"." '{print $1}' | xclip -selection c</pre>
```bash
ls -C | awk -F"." '{print $1}' | xclip -selection c
```

+ 1
- 1
content/blog/2018-05-11-genie-the-voice-enabled-coding-companion-winner-dell-intern-hackathon.md

@@ -13,6 +13,6 @@ tags:
- python

---
<span class="embed-youtube" style="text-align:center; display: block;"></span>
<iframe width="100%" height="480" src="https://www.youtube.com/embed/vjo0oVPG_9c" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Genie is a Voice Assistant made up of three agents who talk to you and help you automate software engineering tasks. Watch the video to understand what it can do for you.

+ 1
- 1
content/blog/2018-06-07-emacs-starts-a-bit-slow.md

@@ -13,4 +13,4 @@ format: aside
---
Emacs starts a bit slow but it can be started as a daemon

<pre class="brush: plain; title: ; notranslate" title="">emacsclient -c -n -e '(switch-to-buffer nil)'</pre>
`emacsclient -c -n -e '(switch-to-buffer nil)'`

+ 29
- 29
content/blog/2018-07-30-functional-options-for-testing-without-mocks-in-golang.md

@@ -18,39 +18,39 @@ Usually, structs are created with Option structs which hold parameters which are
Another way is to use Functional Options, for example
<pre><code class="language-go">
type Server struct {
logger *logrus.Logger // optional
store databaste.Store // required
}
type ServerOption func(Server) Server
func WithLogger(logger *logrus.Logger) ServerOption {
return func(s Server) Server {
s.logger = logger
return s
}
}
func NewServer(store database.Store, options ...ServerOption) *Server {
s := Server{store: store}
for _, option := range options {
s = option(s)
}
return &s
}
func main() {
myServer := NewServer(myStore, WithLogger(myLogger))
}
</code></pre>
```go
type Server struct {
	logger *logrus.Logger // optional
	store  database.Store // required
}

type ServerOption func(Server) Server

func WithLogger(logger *logrus.Logger) ServerOption {
	return func(s Server) Server {
		s.logger = logger
		return s
	}
}

func NewServer(store database.Store, options ...ServerOption) *Server {
	s := Server{store: store}
	for _, option := range options {
		s = option(s)
	}
	return &s
}

func main() {
	myServer := NewServer(myStore, WithLogger(myLogger))
}
```
In the above example, we can set the logger without having to depend on config structs or obfuscating the API.
Now that we have potentially solved configuration issues, we can move on to testing. To avoid writing mock functions, we can inject a function that actually performs the request. This way, the default is the actual implementation, but a test can inject a function which simply hands back the data we want to check, in a form that is easier to test against.
<pre><code class="language-go">
```go
// app.go
// WithRequestSender sets the RequestSender for MyStruct.
func WithRequestSender(fn func([]byte, *MyStruct)) Option {
@@ -94,7 +94,7 @@ func TestMyStruct_save(t *testing.T) {
})
})
}
```
This approach enables us to check data that might be coming to us in some convoluted way, without ever having to write complicated, unreadable code or having to modify much of the actual implementation.
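To make the pattern concrete, here is a self-contained sketch reusing the snippet's names (`WithRequestSender`, `MyStruct`, `Option`); the struct layout, constructor and `save` body are my illustrative assumptions, not the post's actual code:

```go
package main

import "fmt"

// MyStruct holds the request sender as a plain field, so tests can
// replace it without any mocking framework.
type MyStruct struct {
	send func([]byte, *MyStruct)
}

type Option func(MyStruct) MyStruct

// WithRequestSender swaps out the function that ships the payload.
func WithRequestSender(fn func([]byte, *MyStruct)) Option {
	return func(m MyStruct) MyStruct {
		m.send = fn
		return m
	}
}

// New defaults to the real sender; options can override it.
func New(options ...Option) *MyStruct {
	m := MyStruct{send: func(b []byte, _ *MyStruct) {
		// the real implementation would POST b to the remote service
	}}
	for _, option := range options {
		m = option(m)
	}
	return &m
}

func (m *MyStruct) save(payload []byte) { m.send(payload, m) }

func main() {
	// Test-style usage: capture exactly what save would have sent.
	var got []byte
	m := New(WithRequestSender(func(b []byte, _ *MyStruct) { got = b }))
	m.save([]byte("hello"))
	fmt.Printf("captured: %s\n", got) // captured: hello
}
```

Because the sender is just a field, the capture closure in the test sees exactly the bytes `save` would have shipped, with no change to the production code path.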

+ 3
- 9
content/blog/2018-09-25-whistle-project-winner-ethindia-2018-hackathon.md

@@ -12,21 +12,15 @@ tags:
- hackathon

---
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube wp-has-aspect-ratio wp-embed-aspect-4-3">

<div class="wp-block-embed__wrapper">
<span class="embed-youtube" style="text-align:center; display: block;"></span>
</div><figcaption>Demo Video</figcaption></figure>
<iframe width="100%" height="480" src="https://www.youtube.com/embed/-9jnaQEjC1Q" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Recently I took part in the EthIndia Hackathon that took place in Bengaluru. This time I was participating without a team after a long time, and I formed one on the day of the event. All three of us (Ronak, Ayush and I) had different ideas of what we should work on, but we finally came to a consensus on an idea that I had got from my current workplace&#8217;s CTO (Kailash Nadh). He had discussed a problem statement where he wanted to distribute the asset-holding information of people who have died to their family members. This is commonly called a Dead Man&#8217;s Switch, which has been covered in a lot of movies as well as various experimental ideas. This was a big problem to solve, not only in size but also in the number of question marks it raises. After a lot of discussion with various mentors from the Ethereum community, we decided on and implemented the following idea by reducing the scope (instead of covering all assets, stick to only sending videos through IPFS) and skipping the big issues (like missed heartbeats).

Whistle &#8211; a platform to empower whistleblowers and those who live under constant fear of death. Using smart contracts and the NuCypher proxy re-encryption MockNet, we store the re-encrypted IPFS hash of the recorded video on the smart contract, which can be interacted with through our heartbeat function interface, which resets the decryption timer to a future date. In case a heartbeat is missed, the contract triggers emails containing the decrypted IPFS hash, from which the video can be streamed by anyone else.

The best part about the event was the mentorship, which guided us throughout the hackathon. We learnt that any good product needs a few use cases which it is trying to solve, and it should solve those perfectly. Along those lines, we did a bit of research and found out a bit more about this issue. Recently, Latifa Al Maktoum, a woman belonging to the royal family of Dubai, ran away and came to India as she was being tortured and drugged. She released a video on YouTube, where she tells her viewers that if they are watching it, she might already be dead!

<div class="wp-block-embed__wrapper">
<span class="embed-youtube" style="text-align:center; display: block;"></span>
</div><figcaption>The full video</figcaption></figure>
<iframe width="100%" height="480" src="https://www.youtube.com/embed/UN7OEFyNUkQ" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Using a unique combination of heartbeat transactions and the NuCypher MockNet, we can enable them to allow decryption of the video only after their demise. We also integrated a small platform on top, through which whistleblowers can assign recipients such as news agencies. The recipients stored on the contract can then be sent emails with the link to the data stored on IPFS, once the video&#8217;s hash stored on the contract is decrypted using our method. A few other examples are people who may be related to influential families or groups, ex-members of cults, people stuck in legal loopholes, or someone who is just afraid that they may die before publishing their findings, such as a whistleblower. In India, there are multitudes of cases; one such example is the Vyapam scam, where &#8220;[more than 40 people associated with the scam have died since the story broke in 2013][1]&#8221;, many of whom were critical witnesses and whistleblowers whose testimony was lost due to their murder. Our platform, Whistle, hence enables its users to anonymously store information until their demise.


+ 3
- 3
content/blog/2018-11-19-streaming-audio-from-linux-to-android-using-pulseaudio-over-lan.md

@@ -18,15 +18,15 @@ PulseAudio provides streaming via SimpleProtocol on TCP via a simple command. Al

You can find the source by running this command:

<pre class="wp-block-code"><code>pactl list | grep "Monitor Source"</code></pre>
```bash
pactl list | grep "Monitor Source"
```

After this, you can run:


<pre class="wp-block-code"><code>pactl load-module module-simple-protocol-tcp rate=48000 format=s16le channels=2 source=&lt;SOURCE> record=true port=&lt;PORT (eg 8000)></code></pre>
```bash
pactl load-module module-simple-protocol-tcp rate=48000 format=s16le channels=2 source=<SOURCE> record=true port=<PORT (eg 8000)>
```

Next, you will need to download PulseDroid; the apk can be found in the GitHub repository, or you can use the following command to download it using wget:

<pre class="wp-block-code"><code>wget https://github.com/dront78/PulseDroid/raw/master/bin/PulseDroid.apk</code></pre>
```bash
wget https://github.com/dront78/PulseDroid/raw/master/bin/PulseDroid.apk
```

Just enter the IP address of your machine (you can find it by running ifconfig) and the port you chose and press the Start button.

+ 8
- 4
content/blog/2019-01-08-setting-so_reuseport-and-similar-socket-options-in-go-1-11.md

@@ -17,7 +17,8 @@ By reading how support for this has been added, we can get an idea about how to

Let us see how one would start a UDP reader that performs a callback on receiving a packet.

<pre class="wp-block-code"><code>type UDPOptions struct {
```go
type UDPOptions struct {
	Address         string
	MinPacketLength int
	MaxPacketLength int
@@ -41,11 +42,13 @@ func StartUDPReader(opt UDPOptions, callback func([]byte)) {
			callback(packet)
		}
	}
}
```

This is how the reader would look after adding SO_REUSEPORT using the new way.

<pre class="wp-block-code"><code>func StartUDPReader(opt UDPOptions, callback func([]byte)) {
```go
func StartUDPReader(opt UDPOptions, callback func([]byte)) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var opErr error
@@ -77,7 +80,8 @@ This is how the reader would look after adding SO_REUSEPORT using the new way.
			callback(packet)
		}
	}
}
```

Using this approach we can reuse the port and have zero downtime between restarts, by starting the new reader before stopping the currently running one.
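For reference, here is a minimal self-contained version of the `Control` hook, a sketch assuming Linux and the golang.org/x/sys/unix package; the UDP address and port are placeholders:

```go
package main

import (
	"context"
	"log"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

func main() {
	lc := net.ListenConfig{
		// Control runs on the raw socket after creation but before
		// bind(2), which is exactly where SO_REUSEPORT must be set.
		Control: func(network, address string, c syscall.RawConn) error {
			var opErr error
			if err := c.Control(func(fd uintptr) {
				opErr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
			}); err != nil {
				return err
			}
			return opErr
		},
	}
	conn, err := lc.ListenPacket(context.Background(), "udp", ":8000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	// A second copy of this process can now bind :8000 at the same
	// time, which is what makes the zero-downtime handover possible.
}
```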


+ 0
- 11
content/blog/2019-02-23-.md

@@ -1,11 +0,0 @@
---
title: t
author: rhnvrm
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: blog/?p=459
categories:
- notes

---

+ 2
- 4
content/blog/2019-03-17-a-review-of-the-siempo-launcher.md

@@ -16,11 +16,9 @@ Last December, I decided to start an experiment and adopt a new launcher called

After surveying all the options, the only fully featured launcher (that was usable) I found was [Siempo][1]. Another notable mention was the Minimal Launcher, but it did not have a free dark mode or even proper app search, making it unusable apart from phone calls and messages. I did not want to go to the extreme with this experiment, so Siempo seemed to be the best option out there for Android. A few notable features of this app based on my experience are mentioned below. But before that, I must mention the ideas on which, I would guess, the app is mostly based.

Tristan Harris, a former Design Ethicist at Google, started a movement around 2-3 years ago called _[Time Well Spent][2]_, [now called][2] _[Humane Tech][2]._ Nothing explains this better than his TED Talk on &#8220;How a handful of tech companies control billions of minds every day&#8221;.

<div class="wp-block-embed__wrapper">
<span class="embed-youtube" style="text-align:center; display: block;"></span>
</div></figure>
<iframe width="100%" height="480" src="https://www.youtube.com/embed/C74amJRp730" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

A few notable things from the website are copied below for reference


+ 2
- 2
layouts/index.html

@@ -44,12 +44,12 @@
<section>
<b>categories</b>:
{{range ($.Site.GetPage "taxonomyTerm" "categories").Pages }}
<a href="{{.Permalink}}">{{.Title}}</a>
<a href="{{.Permalink}}">{{lower .Title}}</a>
{{end}}
<br><br>
<b>tags</b>:
{{range ($.Site.GetPage "taxonomyTerm" "tags").Pages }}
<a href="{{.Permalink}}">{{.Title}}</a>
<a href="{{.Permalink}}">{{lower .Title}}</a>
{{end}}
</section>


+ 3
- 1
layouts/section/blog_list.html

@@ -1,5 +1,5 @@
{{ define "title" -}}
Blog List | {{ .Site.Title }}
{{- end }}
{{ define "header" }}
{{ partial "masthead.html" . }}
@@ -12,8 +12,10 @@
<section>
<ul class="all-posts">
{{range .Site.RegularPages}}
{{if .Date}}
<li>{{.Date.Format "2006-01-02"}} <a href="{{.Permalink}}">{{.Title}}</a></li>
{{end}}
{{end}}
</ul>
</section>

