Cluster data expansion and implicit surfaces

Ok, boys and girls. So I have been looking very actively into the idea of cluster based data expansion. Sure, Velocity transfer can nail very interesting results, especially since it provides us with kind of emergent behaviour, stemming from the fact that new velocity is being computed per each particle per frame, but.. well there is a but. For this whole thing to work we need to run from frame 0 to frame N and let it cook again , computing all the velocity transfer that is happening. This is good and bad. It is good because it does not matter what computing time your original simulation takes. My algorithm depends only on the count of particles and of course some other little factors like frequency of rebuilding the KD tree.
 
Now.. I was looking at the way they handle volume generation base on pointclouds in Sony. Obviously , they have some pretty bright guys over there and they are using the so called clustering system. What is unique about it is that is very relative. It defines relationships between points in space and build dynamic connections between them. What that means is that each seed particle serves as a node around which connections are build and those connections – populated with new data. When all of that mass moves, the connections scale and the data shifts, again providing us with some kind of dynamic behaviour. What is also great about this system is that it does not need to simulate. So i can expand my 50th frame, look at it , go back to 49 expand it to and they should be seamless. The SIGGRAPH paper that I was reading which elaborates on this technique does not provide a lot of details of the implementation. Currently I am working on a version, written in C++ that is targeted towards Houdini and Pixar Renderman. I will post some results very soon. Another cool idea that my friend Stu suggested is the use of implicit surfaces to define the volume in which my new data is being generated. This should be pretty interesting too. So I am just writing this brief post as a way to say that You, who are reading that should stay tuned, because a lot of exciting stuff is coming very soon. ( I mean those guys at Sony didn’t release the code right ? )

Posted in Uncategorized | Comments Off

Velocity transfer particle data expansion

In the world of computing there is the idea of approximation through speculation. People do it all the time… Using historic patterns of the market to predict future short term moves, using system load measurements to create disaster recovery plans and so on. The concept that if a value has been in the a certain limit the last couple of iterations, then the probability of this limit being broken on the next iteration is lower then the probability of the value staying within the limit is, is a pretty widely used one, but fairly wrong. This is due to the fact that most of the cases when this notion is applied are driven by almost pure randomness.

 

Speculation may be the mother of all evil when it comes to stocks trading and things of such nature, but with computer graphics it comes as the solution to multiple problems. Data enhancement is used by multiple companies to produce effects that are too hard to simulate with an algorithm that heavily approximates physical correctness. This is why for the effects of Alice in Wonderland, the R&D team of the studio that was responsible for particles were heavily using data expansion. They were simulating the so called seed particles and then from those seed particles extrapolating an expanded simulation that contained much more particles. .

Often times the data expansion algorithms are driven by rather primitive logic. A lot of studios use multi point expansion. This is pretty much copying a cluster of static particles ( or moving in some looping manner) onto the seed particles, making the cluster follow the seed. This produces clumps – like look that is unnatural and not very successful. This is shy I decided to experiment with a technique called Velocity transfer.

You can see on the diagram that what is happening is a little more sophisticated than simply copying clusters of point onto other points. What I am doing is injecting a number of points for every seed. Then those points are kept in memory as on every frame a KD spatial tree is constructed. Every one of the newly inserted points is query the tree for its N nearest neighbours. Then a ramp is constructed and the velocity vector of our hero particle is computed, taking into consideration the relative distance of its N neighbours. This is recomputed on every frame. This way we can achieve data expansion that does not produce strange results. Moreover we can generate some pretty interesting emergent behaviour that becomes apparent when an animation is played.

 

 

The video above was rendered by me. I expanded 2000 particles up to 4000000. The results get even better when we go into the tens of millions count. am currently working on a technique that does not involve keeping track of where the particles are in time and is completely frame independent so stay tunes. For the C code or any questions, you can contact me at zaharidichev@gmail.com

 

Posted in Uncategorized | Comments Off

Shortest path problem in the third dimension

I decided that it would be pretty interesting to see how the shortest pathS problem looks in 3d.It is not too hard to employ the same logic that I used for the solution of this problem to a grid that is actually three dimensional. In 3d space it would be very easy to animate the paths slowly growing from one corner to the other of the cube. This problem actually inspired me to work a little bit on that last week. Here You can see a picture of the possibilities of paths in a 2 by 3 grid. I used python`s turtle module to draw the paths. Using recursive permutations and exclusion quickly found all the variants that I needed. )

I used SideFX Houdini1s Python API to generate a visual representation of the possible paths through a 3d cube. The logic is pretty much the same like in the 2d version. It is just that the possible combinations are much much more.

Posted in Uncategorized | Comments Off

Shortest path problem

Suppose you have a wire mesh which is N by M units long, made up of unit square with wire at the edges. (So there are N+1 parallel wires all M long and, perpendicular to these, M+1 all N long). An ant starts off at the bottom left corner of this grid (co-ordinates (0,0) and crawls on the wires the shortest possible distance to reach the top-right corner (N,M). How long is the shortest route? How many different shortest routes are there? (Namely, find a formula in terms of N and M)


To begin with, I assume that the shortest way will be a way in which no backtracking is an option. Lets think of a grid where N is 3 and M is 2. The shortest way possible, if we start from cordinats 0,0 and aim at 3,2 is to travel three units to the right and two units up. If we want to exhaust all the possible options of traveling that amount of distance we need to think about the moves as binary values. Lets say that 1 is UP and 0 is right. Then One possible combination would be: 10001. This will get us to the top corner ( or rather get the ant there).

 

Now what can we change in that set to make it different but preserve the outcome. We know that we have a total of 5 steps that need to be performed in order to reach the top corner. From those five steps two need to be UP or 1s and we could leave the rest being lefts or 0s. SO if we assume , we have 5 slots for move commands, we need to put two UP commands and fill the rest with RIGHT commands. In this case our set is the total sum of the up and left moves that we need to perform and n is the number of up moves. This problem is easily solved with combinations. First we find the total possibilities for this slots to be filled – 5!/ (5-2)!. Dividing by the factorial of 2 gives us the total UNIQUE patterns which are 10.

 

So lets get back to our grid… We simply use the combination formula described above. and from it we can derive that the total shortest ways in a grid the size of N by M for N > M can be found by calculating: ((N + M)!/N!)/M! which can also be described as (N+M choose M).

Another interesting observation is that when we take each vertex of our grid and assign to it the value that equals the total shortest path ,we can see the Pascal triangle. Thus we can solve this problem by expanding a Pascal triangle over the vertices of the grid. The method for expanding the Pascal’s triangle includes a statement that explains that if an element is missing from the above we can substitute it with zero. Another thing to notice is the fact that We already know some of the values that are on our vertices. The ones that are straight up and straight to the right. There are only 1 shortest path for each of them to be reached.

 

Posted in Uncategorized | Comments Off

Volume slicing in C++

So, in relation to my previous posts, I decided to post some code that i put together. To some it might appear a little sluggish to use python for so much processing ( that takes a little while) . Thats why I decided to do some testing with C++ , which I am still in the process of learning. The piece of code that I am sharing is for slicing the volume files that I created earlier and producing an image. You need to have pngwriter compiled on your system for this snippet to work ( I compiled on Centos 6 and works just fine).  What I am doing here is essentially reading the file and then putting it into a vector of vectors of vectors structure in order to emulate three dimensional array. Truth is , I am really fond of the ndarray that SciPy provides and was kind of disappointed that there is no ready - made data type in C++ supporting the huge number of methods that are present in the scientific computing module for python ( Hello ? Where is the ndarray.roll() function). Anyway, with some homebrewing, you can put together a usefulf datatype to work with scalar data in C++. Not to mention that there are some more advanced libraries that provide some of the functionality already. There was even a project on  ( an open source one), that tries to mimic the ndarray from SciPy - take a look here I decided to stick with the old clustering method for doing the trick. I am really new to C++ and although I see its great power , I still dont feel very comfortable with it so my code is most likely full of bugs , redundancy and mistakes but I guess that is how people learn right ? Another thing to mention is that I found a pretty interesting function online. It is for tokenizing strings by delimiter. It proved to be very useful for reading the volume file. You may ask why in the world I am reading the whole file and not breaking from the loop when reach the slice that I am writing to an image file. Simple enough. - I decided to leave it like that because I might find some more time to implement some of the things that I did in the python version of my experiment and for this purpose I need all the data available ( filters and other things).  Take a look below where I am posting the tokenize function. It is pretty useful. For the full version of the code , you cna go to Active state and check it out by clicking here
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
void Tokenize(const string& str,
vector& tokens,
const string& delimiters = " ")
{
// Skip delimiters at beginning.
string::size_type lastPos = str.find_first_not_of(delimiters, 0);
// Find first "non-delimiter".
string::size_type pos = str.find_first_of(delimiters, lastPos);
while (string::npos != pos || string::npos != lastPos)
{
// Found a token, add it to the vector.
tokens.push_back(str.substr(lastPos, pos - lastPos));
// Skip delimiters. Note the "not_of"
lastPos = str.find_first_not_of(delimiters, pos);
// Find next "non-delimiter"
pos = str.find_first_of(delimiters, lastPos);
}
}
 
Posted in Uncategorized | Comments Off

Volumetric tools for Python

So following my interest in the topic of MRI visualization  and my passion for 3d, I decided to write a little collection of tools that would be useful when dealing with volumetric data.  To begin with I need to actually specify some kind of convention in terms of a file format so  I can work with the the data and know what I am looking at. I originally started working with the Stanford volume data library. They have a couple of scans of brains and skulls which were 256X256 by 109 slices.  Later on I wanted to add a little more resolution to what I was working with so I moved on to working with some data from the http://www.mr-tip.com/. The thing about this type of data source is that I could only get different PNG files with the slices. Then extracting the values  was simple with a little python script.

To elaborate on the file specification a little bit here is a quick diagram of how things actually look like in my file.  I wanted to actually consolidate everything into one single file where I will have all of my slices and will be able to quickly extract ti work with it and visualize the result in the end. After data is interpreted it gets written as a sequence of strings that  are separated by “;”. The template for the entry looks like:

Xcord   Ycord   Zcord  VoxelValue ;

 

Since this is all ASCII data and its a lot of it the files got pretty big. That’s why I used the bzip compression library to deflate the files and make them more manageable. I must say that it worked pretty well and I got small compact  archives to work with.

I also wanted a tool that will be working under SideFX Houdini software and will be exporting the  volumetric data into the same file type that my Python module can read and work with.  Some of You my question why I am using both C++ and Python in this particular case. Well, while the python module is meant to actually work independently of an application, It can work under Maya or Houdini or just by itself, the C++ Houdini plug-in is  just for exporting the data from Volume primitives into my standard  file format. I chose c++ because the Houdini API for it is pretty good and of course because of the speed. Also coding some c++ was a great exercise.

So what is in the module ? Well, there are a couple  of functions to perform basic tasks with the volume data. There is read , write ( in slices) there  are also some filters one of which is anisotropic diffusion. The anisotropic diffusion algorithm is pretty interesting as it eliminates noise in the data and produces a cleaner image while preserving the edges. I will later post some code but for now, here is some images after processing the data.  I will soon upload a PDF, explaining in details the filters and approaches that I used and why I used it. A quick note that I want to make is that I am using the ndimage  module which is a part of the SciPy library. This particular piece of software is a life saver when it come to dealing with scalar type data ( aka Nd arrays) as in the particular case. Bellow ,You can take a look at some of the results and intersection graph to get a better idea of how the data reacts upon filtering.

Posted in Uncategorized | Comments Off

Linux install on the cloud

So I completed this little task at work recently and I decided to blog about it because it seems that it can be helpful for a lot of people in the same situation. To render the whole picture – I am working as a work – study system administrator for SCAD Network Services in the Digital Meadia department at my university and we are supporting a good number of dual boot Windows/Redhat workstations. So we need a solution that can deploy the operating system in a fairly unattended fashion. We have this option for windows since we are using Microsoft WDS. So what about Linux then ?

 

Well we have a kickstart installation that hangs on USB drive and whenever we plug it into the station, we can boot through USB and it will kickstart the installation along with postscripts and whatever needs to be customized for the system. Now… I had the idea that we can put this kickstart setup on a cloud and cheat the WDS to boot into linux instead of Windows if we want. Needless to say this would be pretty convenient – to be able to reimage any workstation anywhere just over network and be able to choose whether we want to start a windows or linux reimaging process. So Steve Abrams ( who is about the smartest windows guy I have ever met) said that its possible to do so and we should try it out. So let me show you a quick schematic view of how the whole setup looks like and then I will go into more details concerning technical aspects.

As You see from the little diagram , there are two servers that are in the current setup ( of course one can do it with only one server). So when the machine is booted into network mode a menu pops out prompting you to choose between Windows or Linux installation. Choosing linux is booting the kernel from the Redhat 5.5 installation and is starting the installation in a kickstart mode , pulling the kickstart file from the http. NFS server , which also happens to store all of the other files that we need for the installation ( postscripts , conf files rpms and in the particular case even a custom repository for installing the packages that we need here at campus). So the whole installation can be summarized as follows:
 
1. Booting through PXE and loading the kernel
2. Looking at the kickstart file and performing installation
3 Running postscripts that polish a lot of the system`s setting for our purpose

 
This particular scenario is really useful. Our particular setup is a lot more sophisticated than what I am explaining here, but I cannot go into extreme details due to privacy issues. The beauty of this particular design is that it hold everything in a centralized location. Whenever we need to change any of the postscripts or the installation packages, we can simply edit the ks file and this one ks file will end up being used by every machine on the network. Additionally this server can hold updates scripts, config files and so on. If you want to read a little more step by stem oriented tutorial, I advice You to take a look right here.

Posted in Uncategorized | Comments Off

Parsing raw binary MRI data to Houdini (python)

Magnetic resonance imaging (MRI) is an imaging technique used primarily in medical settings to produce high quality images of the inside of the human body. MRI is based on the principles of nuclear magnetic resonance (NMR), a spectroscopic technique used by scientists to obtain microscopic chemical and physical information about molecules. The technique was called magnetic resonance imaging rather than nuclear magnetic resonance imaging (NMRI) because of the negative connotations associated with the word nuclear in the late 1970′s. MRI started out as a tomographic imaging technique, that is it produced an image of the NMR signal in a thin slice through the human body. MRI has advanced beyond a tomographic imaging technique to a volume imaging technique.
 
The raw MRI data usually comes in a sequence of files for the different slices. Some of them have headers and some just rely on additional file where the type of data that the RAW contains is explained. I wrote a Houdini Python SOP that reads the MRI data and writes voxel values to a volume primitive so the data can be visualized after that in the viewport ( and eventually raymarched). I releid on the Stanford Volume Library, where they have a set of MRI scans done a while ago. I worked with the Brain set which was 109 slices of 256 x 256 pixels and contained the data as 16-bit integers (short). The script is nothing complicated, just simple data parsing. The unpacking of the data was done with the struct module for python which is a very handy library for working with binary type data. This was a pretty basic exercise in parsing the data. It got me interested though in the problems that are present when dealing with this type of information. I will be posting some other updates on my current experiment on filtering noisy MRI scans using gauss and anisotropic diffusion filtering.
 

Posted in Uncategorized | Comments Off

Option pricing and stock simulation with Python ( Black Scholes )

Recently I have been really interested in the behaviour of Stocks. People have been trying to use mathematical models to predict the movement of stocks for years now without real success. I decided that it would be a good exercise to write a script that employs the Black Scholes option pricing model.

The Black Scholes model for pricing stock options was developed by Fischer Black, Myron Scholes and Robert Merton in the early 1970’s. It is arguably the most important result in financial engineering, and is certainly a rich source of interview questions in the financial services industry.
The basic idea of it is that it tries to value a contract whose value depends upon the value of a particular stock N years from now. My implementation of the model uses Historical Volatility to compute the standard deviation for a particular period of time ( 1 Year ). I utilized two classes in the design of my program – Stock an Option. This is how the result looks like. This is pricing of an option with Historical Volatility (yearly) of 9.3% , risk free interest rate of 5% and expiration time of 3 months. and a price 5 units less than the current price.
A valid question that comes is where I got the data i order to compute my Historical Volatility from. I designed a stock market simulation in python that moves the stock using a Random Walk algorithm. The simulation has different features like Disaster occurrence and Momentum. Te simulation follows the idea of he efficient-market hypothesis , thus giving the attribute of momentum, one can control whether the concept is of semi strong or strong efficiency. The data from the simulation then is stored so the option pricing model can work with it and compute the Yearly Volatility. I will be uploading a detailed breakdown of the program soon in the form of a PDF, in which I will explain both the theory and some technical consideration of my particular implementation.
Posted in Uncategorized | Comments Off

Emergent behaviour simulation with Python

I have been interested in simulations of any kind recently. The idea of designing or implementing a model that is meant to mimic reality in a certain way is just fascinating to me. There are plenty of algorithms which purpose is to replicate a phenomena seen in real life – from systems that model the shapes of snowflakes to simulations trying to mimic the behaviour of the stock market ( as you saw in my previous post ).

 

The propose of this project is to create a system that is based on the BOIDS algorithm.  Boids is a mathematical model , developed by Craig Reynolds in 1986, which simulates the flocking behaviour of birds.  This approach assumes a flock is simply the result of the interaction between the behaviors of individual birds. To simulate a flock we simulate the behavior of an individual bird (or at least that portion of the bird’s behavior that allows it to participate in a flock). To support this behavioral “control structure,” we must also simulate portions of the bird’s perceptual mechanisms and aspects of the physics of aerodynamic flight. If this simulated bird model has the correct flock-member behavior, all that should be required to create a simulated flock is to create some instances of the simulated bird model and allow them to interact. In its core the system is driven by three main rules ( although additional ones can me implemented):

 

1 Cohesion – Boids try to fly towards the centre of mass of neighbouring boids

 

2 Separation – Boids try to keep a small distance away from other objects (including other boids)

 

3Alignment – Boids try to match velocity with near boids.

In order to implement this system I chose Python, because of its simplicity and ease for prototyping code. I also designed my code o it can be both standalone decision, outputting cache files with coordinates and a plugin for Houdini Animation tools. I later saw that implementation in Houdini was a very nice decision since it gave me a way of quickly and interactively add interface controls and preview the behaviour of my agents in the viewport. Moreover if I want to render a high quality animation with textures and proper lighting I can always bake the simulation and render. One of the problems of python especially with a program that is so computationally intense was speed. That`s why I implemented an space partitioning algorithm ( KD TREE ) in order to be able to process more heavy simulations. Below is a preview of the plugin that I wrote in the form of a video file. I will upload a paper really soon in which I will go over the design and implementation in more details, explaining my class hierarchy and some other technical considerations when I was designing the program.

 

Posted in Uncategorized | Comments Off