Tesla Personal Supercomputer by Nvidia

nukethewhales31 · May 18, 2009

Quote from LAtoLV:

Billed as the first personal supercomputer for under $10,000.

http://www.nvidia.com/object/personal_supercomputing.html

Me likes. Me wants. Me buy?

Great, so Windows takes less time in between lock-ups.

intradaybill · May 18, 2009

Quote from jprad:

Charging you extra for what amounts to an amateurish hack shouldn't be tolerated in something that's advertised for trading professionals.

I agree. But it still does what I want even with the hack so I don't care how it's done. I don't know that much about parallel computing anyway.

Quote from jprad:

Actually, there's a special term for this sort of problem and it's treated fairly well here:

http://en.wikipedia.org/wiki/Embarrassingly_parallel

Maybe my knowledge is too limited on this subject, but I don't see how the example I gave is embarrassingly parallel when the evaluation of one variable depends on the value of another variable.

Can you explain how to make something like the example I gave you run in multiple threads?

As far as APS goes, I understand that searching for patterns in 2 data files can run in parallel, but the question is whether they can make a single search run in parallel processing mode. In fact, all I do is a single search at a time.

dcraig · May 18, 2009

Quote from vikana:

Personally, I'd rather see a board with 100 386-instruction-set CPUs, where normal software would have an easier time exploiting the parallelism.

That is just what Intel's Larrabee is about:

http://www.ddj.com/architect/216402188

This could be very interesting.

jprad · May 18, 2009

Quote from intradaybill:

Maybe my knowledge is too limited on this subject, but I don't see how the example I gave is embarrassingly parallel when the evaluation of one variable depends on the value of another variable.

No, your example is pretty straightforward, and it can be parallelized. But, it's easier to think of all this in terms of atomic functions. Let's start with:

a = f(b)
c = f(d)

Since the dependent variables in each, 'a' and 'c', are independent of each other, their functions can be parallelized.

On the other hand, the sequence:

a = f(b)
c = f(a)

cannot be parallelized, since 'a' has to be computed first because 'c' is now dependent on 'a'. (And no, 'y' in your example isn't dependent on 'x' in the same way as here, since the value of 'x' is constant during the entire iteration of the inner loop.)
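To make the two cases concrete, here is a minimal Python sketch. The function f and the inputs b and d are made-up stand-ins, not anything from the original example, and Python threads are used only to show the ordering constraint, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def f(v):
    # Hypothetical stand-in for any pure function of its argument.
    return v * v + 1

b, d = 3, 4

with ThreadPoolExecutor(max_workers=2) as pool:
    # Independent case: neither call needs the other's result,
    # so both can be in flight at the same time.
    fut_a = pool.submit(f, b)
    fut_c = pool.submit(f, d)
    a_independent, c_independent = fut_a.result(), fut_c.result()

    # Dependent case: the second call cannot even be submitted
    # until 'a' is known, so the two steps run one after the other.
    a_chained = pool.submit(f, b).result()
    c_chained = pool.submit(f, a_chained).result()
```

In the second half the pool is still there, but the data dependency forces the calls into sequence no matter how many workers are available.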

As far as APS goes, I understand that searching for patterns in 2 data files can run in parallel, but the question is whether they can make a single search run in parallel processing mode. In fact, all I do is a single search at a time.

From a functional perspective why would the input to a function that searches for a cup w/handle pattern be dependent on the output from a function that searches for a head & shoulders pattern?

The only possibility is poor program design with the use of global variables almost always at the top of that list.

vikana · May 18, 2009

Quote from dcraig:

That is just what Intel's Larrabee is about:

http://www.ddj.com/architect/216402188

This could be very interesting.

Thanks for the link to the Mike Abrash article. He knows his stuff!

jimbojim · May 19, 2009

Quote from jprad:

No, your example is pretty straightforward, and it can be parallelized. But, it's easier to think of all this in terms of atomic functions. Let's start with:

a = f(b)
c = f(d)

Since the dependent variables in each, 'a' and 'c', are independent of each other, their functions can be parallelized.

On the other hand, the sequence:

a = f(b)
c = f(a)

cannot be parallelized, since 'a' has to be computed first because 'c' is now dependent on 'a'. (And no, 'y' in your example isn't dependent on 'x' in the same way as here, since the value of 'x' is constant during the entire iteration of the inner loop.)

WTF you bozo retard, wiki freak.

Look at his example carefully:

x = 0.
y = 0.
for i = 0 to 100
  x = x+i
  for j = 1 to 1000
    y = x+2j
  end
end

This translates to:

a = f(b) // b = i
c = g(a,d) // d = j

Calculation of c is dependent on a. This cannot be parallelized (easily).

Bozo...

Hugin · May 19, 2009

Quote from vikana:

The biggest issue/problem with the Tesla architecture is that you have to re-design your software around their APIs. For some that's easy, but for many, it's probably not a good fit.

If your software is already highly distributed and parallel without lots of locking, CUDA might fit. Otherwise, it's a big project to support it.

Haven't done any real work with CUDA yet but I have read the documentation as well as visited their forum.

To gain a large performance boost you do need to parallelize your algorithm, but unfortunately that is not enough. Equally important are the memory access patterns of your algorithm. The GPU reads data in blocks and if your problem does not map to the access pattern it will need to synchronize which will slow things down a lot. For complex algorithms it seems this could become harder than making the algorithm parallel.
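One rough way to picture the access-pattern point is to count how many memory segments a group of threads touches in a single load. The Python sketch below is only a toy model: the group size of 32 threads, the 4-byte word and the 128-byte segment are assumptions chosen for illustration, and the real coalescing rules differ between GPU generations.

```python
WARP_SIZE = 32        # assumed number of threads that issue a load together
WORD_BYTES = 4        # one 32-bit float per thread
SEGMENT_BYTES = 128   # assumed size of one memory transaction

def segments_touched(stride):
    """Count distinct segments hit when thread t reads element t * stride."""
    addresses = [t * stride * WORD_BYTES for t in range(WARP_SIZE)]
    return len({addr // SEGMENT_BYTES for addr in addresses})

coalesced = segments_touched(1)   # neighbouring threads read neighbouring words
strided = segments_touched(32)    # every thread lands in a different segment
```

Under these assumptions the contiguous pattern needs one segment and the stride-32 pattern needs 32, which is the kind of gap that can erase the gain from parallelizing the arithmetic.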

/Hugin

byteme · May 19, 2009

Quote from jimbojim:

This cannot be parallelized (easily).

Bozo...

Execute 101,000 threads in parallel with the following kernel:

y = (i*(i+1)/2) + 2j

Where:

i is an input from 0 to 100 and
j is an input from 1 to 1000

Analogous to pixel rendering for each x, y coordinate on a screen that is 101 by 1000 pixels in size, i.e. perfectly suited to parallelism.
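The closed form works because x after iteration i is the sum 0 + 1 + ... + i = i*(i+1)/2, so each (i, j) pair can be evaluated without knowing any other pair. A quick Python check of that claim against the original nested loop might look like this. The names sequential, kernel and pairs are made up for the sketch, and the thread pool only stands in for GPU threads; it checks correctness, not speed.

```python
from concurrent.futures import ThreadPoolExecutor

def sequential():
    """The original nested loop, recording y for every (i, j)."""
    x = 0
    out = {}
    for i in range(0, 101):
        x = x + i
        for j in range(1, 1001):
            out[(i, j)] = x + 2 * j
    return out

def kernel(ij):
    """One independent work item: closed form for x, then y."""
    i, j = ij
    return i * (i + 1) // 2 + 2 * j

# 101 values of i times 1000 values of j = 101,000 work items.
pairs = [(i, j) for i in range(0, 101) for j in range(1, 1001)]

with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = dict(zip(pairs, pool.map(kernel, pairs)))

matches = parallel == sequential()
```

If matches is true, every one of the 101,000 values agrees with the sequential loop, including the last one the loop leaves in y.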

gasy · May 19, 2009

lot of respect for NVIDIA after this, for making it affordable that is

jprad · May 19, 2009

Quote from jimbojim:

WTF you bozo retard, wiki freak.

Look at his example carefully:

I did, but since you insist...

Code:

main()
{
  x = 0
  y = array[101]

  for i = 0 to 100
    x = x+i
    fork_thread(i, proc_x(y, i, x))
  end

  wait_thread(100)
  print(x, y[100])
}

proc_x(array y, int i, int x)
{
  for j = 1 to 1000
    y[i] = x+2j
  end

  return
}

Both fork_thread() and wait_thread() are OS dependent. A decent treatment can be found on wiki, but you don't seem open to that. So, here's one of the books that I've got, about 10 years old by now:

http://www.amazon.com/Win32-Multith...=sr_1_8?ie=UTF8&s=books&qid=1242728857&sr=8-8
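For anyone who wants to run the idea rather than read pseudocode, a rough Python equivalent follows. It assumes the outer loop accumulates x = x + i as in the original example, uses threading.Thread in place of the hypothetical fork_thread() and join() in place of wait_thread(), and sizes y at 101 elements so that index 100 is valid.

```python
import threading

def proc_x(y, i, x):
    # x arrives by value, so later updates in main() cannot change it here.
    for j in range(1, 1001):
        y[i] = x + 2 * j

def main():
    x = 0
    y = [0] * 101                 # i runs from 0 to 100 inclusive
    threads = []
    for i in range(0, 101):
        x = x + i
        t = threading.Thread(target=proc_x, args=(y, i, x))
        t.start()                 # plays the role of fork_thread()
        threads.append(t)
    for t in threads:
        t.join()                  # plays the role of wait_thread()
    return x, y

x, y = main()
print(x, y[100])
```

Each thread writes only its own slot y[i], so no locking is needed. The run should print 5050 7050.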

Bozo...

Dipstick...