Don't get me wrong - I'm not saying the performance benefit isn't great; it is. What I mean is that the absolute speedup matters less than the relative comparison. At the same resolution Jim K. was operating at, the speedup ratio between Octave and Matlab is about 3x: Jim reported roughly a 121x speedup for Matlab, and for the same resolution I saw about ~360x for Octave.
Please see the graph in the link below that shows the time it takes for one of the vectorized filters to operate at various image sizes.
https://www.dpreview.com/forums/post/59077181
The prior art is written in a way that is difficult for the compiler to understand, or the compiler is not well implemented for speed.
Yes, that may be true. However, I don't think it is necessarily the code-generation part of the compiler; IMHO, it could be the parser. In fact, the timing graph in the link above suggests that Octave scales roughly linearly as image size increases, or at least not superlinearly. I would think that the parsing in the interpreter is what needs more optimization.
Many people don't seem to realize how much effective parsing methodology matters. That is why I'm a big fan of all CS and even EE students taking the compiler course, which unfortunately is optional at many universities. Not for the code-generation part per se, but for the parsing methodologies taught in that course using powerful tools such as bison/flex, etc. With the advent of big data, I feel there is a need for more effort on parsing methodologies for distributed workflows and data pipelines. However, many students don't realize that and in fact like to skip this very useful course because it is considered difficult.
On a different note, at its heart Octave, like many other open-source GPL'd packages, uses standard high-performance libraries such as FFTW3, LAPACK, etc. Hence, the many parts of its code that run at native speed on the hardware should be fast enough.
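The same division of labor shows up in Python: NumPy hands FFTs and linear algebra off to optimized native libraries, which is exactly what FFT-based filters exploit. A minimal sketch, assuming only NumPy (the names `x`, `k`, `n` are illustrative), of the convolution theorem that makes those native FFT calls pay off:

```python
import numpy as np

n = 256
rng = np.random.default_rng(0)
x = rng.random(n)  # a signal
k = rng.random(n)  # a filter kernel

# Direct circular convolution: O(n^2), evaluated in the interpreter.
direct = np.array([sum(x[j] * k[(i - j) % n] for j in range(n))
                   for i in range(n)])

# FFT-based circular convolution: O(n log n), runs inside the native FFT.
via_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

print(np.allclose(direct, via_fft))  # prints True
```

The two results agree to floating-point tolerance, but the FFT route spends nearly all of its time in compiled library code rather than in the interpreter.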
If Matlab is 5x faster than Octave (or so), that would suggest Matlab's compiler is a lot better than Octave's. That suggests Matlab > Octave for image processing, which is among the most time-demanding tasks in computing.
Matlab's parser is perhaps faster; not necessarily its code execution. I would expect a commercial company that has been in business for a long time to have a decent code-generating and optimizing compiler.
However, that is still not a reason to buy Matlab when Octave is free.

The case has to be more compelling.
Of course, Matlab is likely much slower than "real code." I once got a non-FFT, convolution-based image processing algorithm working in C#, verified results and all, which could process 4K monochromatic images almost fast enough for video (~25 ms/frame). It used SIMD plus multithreading. If I compiled the code and called the functions through Matlab's ability to hook into .dll files, it took over 40 s and each thread utilized only about 4% of its core. I am not sure how it is possible for it to perform that badly.
Interesting.
Years ago I learned Ruby, later Python. Freshmen in optics are required to take a Matlab course, so I learned Matlab. The language is brilliant for an engineer, as the syntax is fairly "free" and doesn't have much "programmy stuff" like the var keyword, etc.
These days I'm gravitating towards Python. Much of the parser-related slowness we are talking about here, due to Matlab/Octave being interpreter-like environments, might go away. And Python comes with other advantages as well. However, one issue with Python is that multi-threading is effectively broken: the global interpreter lock (GIL) keeps only one thread executing Python bytecode at a time. Yet many young programmers don't even realize what that means for them.
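To make the "broken multi-threading" point concrete, here is a small experiment using only the standard library (the function name `burn` and the size `N` are made up for illustration): two threads doing CPU-bound pure-Python work take about as long as doing the work twice serially, because the GIL serializes bytecode execution.

```python
import threading
import time

def burn(n):
    # CPU-bound pure-Python loop; holds the GIL while it runs.
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 1_000_000

t0 = time.perf_counter()
burn(N)
burn(N)
serial = time.perf_counter() - t0

t0 = time.perf_counter()
threads = [threading.Thread(target=burn, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

# With the GIL, the threaded version is typically no faster than serial.
print(f"serial: {serial:.2f}s  two threads: {threaded:.2f}s")
```

Threads still help for I/O-bound work, where the GIL is released while waiting; it is CPU-bound code like this that gains nothing.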
And, there are some things in Python just unique to that language that are very interesting for newer, cloud-based, distributed, modern data flow-based pipelines.
Another issue is intellectual property. With Matlab (and also R) the scripts themselves are open to anybody. There are ways to get around that, but they are more like hacks. Compiled Python code should obviate that issue.
If you want speed and don't need to share with those "non-programmers," you can always write in C, with possibly great pain, or C++, with a bit less pain.
I love C++. But there are not enough takers these days. And Qt is absolutely amazing.
These languages are the "1x [execution] time cost" tier in terms of performance. These days there are a number of newer entrants (Rust, Go, Scala) that compete with the "old hat" C# in the "4x time cost" tier. Google's V8 JavaScript engine is also phenomenally good, and has JavaScript running at speeds comparable to C#, at times even faster.
Yes, they did that to enable so-called "full stack" developers who work front end to back end entirely in JavaScript, so that they could do everything end to end in one language. But JavaScript is terrible as a language.
C# has a beautifully simple way to convert for() loops into Parallel.For loops that offers 80%+ of the performance of manually managing the threading, with nearly zero effort from the programmer. That's a huge win and lets you parallelize damn near everything. I am not so familiar with Rust, Go, or Scala, as they are mostly server languages.
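For comparison, the closest everyday Python analogue to Parallel.For is probably `concurrent.futures`: swapping a comprehension for `ProcessPoolExecutor.map` (processes rather than threads, to sidestep the GIL). A sketch under that assumption, with `work` as a made-up stand-in for the loop body:

```python
from concurrent.futures import ProcessPoolExecutor

def work(i):
    # Stand-in for a per-iteration body; any picklable function works.
    return i * i

if __name__ == "__main__":
    items = range(8)

    # Serial version.
    serial = [work(i) for i in items]

    # Parallel version: one line changes, much like C#'s Parallel.For.
    with ProcessPoolExecutor() as pool:
        parallel = list(pool.map(work, items))

    print(serial == parallel)  # prints True
```

The `__main__` guard matters: on platforms that spawn workers, the module is re-imported in each child process.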
C# is a reasonable language. However, IMHO, Python will eventually kill it.

Though, MS Visual Studio has a great IDE, one of the best I have used.
I do know that in JavaScript there are now ways to spin off more processes for parallelism, but it is a manual process.
They are trying. But you can't convert a hyena (JavaScript) into a lion (Python).
Manual multithreading is a pain, and many workloads are very difficult to multithread this way. In the median filter example, you need to send each worker a chunk of the image with some padding (the kernel radius), but have it process only the chunk without the padding. My eyes are rolling at the thought of writing that code.
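For what it's worth, the chunk-plus-padding dance can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not production code: `filter_chunk` and `parallel_median` are made-up names, the per-pixel `np.median` loop is slow, and a real version would hand each band to a separate worker (e.g. a process pool) instead of calling them in sequence.

```python
import numpy as np

def filter_chunk(chunk, r):
    """Median-filter the interior of a band that carries r rows of halo
    on top and bottom; return only the interior rows."""
    h, w = chunk.shape
    out = np.empty((h - 2 * r, w), dtype=chunk.dtype)
    for y in range(r, h - r):
        for x in range(w):
            # Clamp the window horizontally at the left/right edges.
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            out[y - r, x] = np.median(chunk[y - r:y + r + 1, x0:x1])
    return out

def parallel_median(img, r, n_chunks):
    """Split img into horizontal bands, each padded with r halo rows.
    Each filter_chunk call is independent and could go to a worker."""
    padded = np.pad(img, ((r, r), (0, 0)), mode="edge")
    h = img.shape[0]
    bounds = np.linspace(0, h, n_chunks + 1, dtype=int)
    pieces = []
    for a, b in zip(bounds[:-1], bounds[1:]):
        # The band's own rows plus r rows of halo on each side.
        pieces.append(filter_chunk(padded[a:b + 2 * r], r))
    return np.vstack(pieces)
```

The key detail is exactly the eye-rolling part: each band is cut from a padded copy with `r` extra rows on each side, while `filter_chunk` writes out only the interior rows, so the seams between bands disappear.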
It is often said that the Matlab compiler takes advantage of SIMD for vectorized functions - see e.g.
http://stackoverflow.com/questions/12615309/how-does-matlab-vectorized-code-work-under-the-hood
You can expect between 4x and 64x of the improvement to be due to the compiler's ability to turn your code into SIMD instructions. Images are usually UInt8s, so I would expect up to 64x from SIMD alone (64 one-byte lanes in a 512-bit register).
Matlab and Octave, being very high-level languages, do not expose any sort of SIMD vs. non-SIMD choice to the programmer.
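The same gap is easy to reproduce from Python, where the split is identical: an interpreted per-element loop vs. one call into a native (and typically SIMD-vectorized) kernel. A rough sketch, with the sizes and names chosen arbitrarily:

```python
import time
import numpy as np

n = 100_000
rng = np.random.default_rng(1)
a = rng.integers(0, 255, n, dtype=np.uint8)
b = rng.integers(0, 255, n, dtype=np.uint8)

# Interpreted per-element loop.
t0 = time.perf_counter()
avg_loop = [(int(a[i]) + int(b[i])) // 2 for i in range(n)]
loop_time = time.perf_counter() - t0

# One vectorized kernel call; widen to uint16 to avoid uint8 overflow.
t0 = time.perf_counter()
avg_vec = (a.astype(np.uint16) + b) // 2
vec_time = time.perf_counter() - t0

print(f"loop: {loop_time:.4f}s  vectorized: {vec_time:.5f}s")
```

On typical hardware the vectorized line is orders of magnitude faster, which is the same effect the vectorized Matlab/Octave filters are cashing in on.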
What a lot of signal processing people often don't realize is that the world out there is not always amenable to parallel code. They are used to embarrassingly parallel algorithms that are easy to cast into parallel code. But given the issues in a real distributed computing environment, it is not always as easy to parallelize things as one might think.
And it is not just GPU computing that is blissfully accommodating of embarrassingly parallel code. Many standard workflows in distributed computing frameworks such as Hadoop and Spark are also geared toward such scenarios. Though for text-based inputs Spark is simply killing Hadoop (MapReduce, that is, not the Hadoop environment such as HDFS, etc.). But that is a different story.
Assuming you are working "purely" with variables in RAM, the response time of RAM is in the tens-of-nanoseconds domain. It is true that loading millions or billions of bytes can bring the total access time up into the millisecond domain, but 1 MP images are not large enough for the speed of RAM to be noticeable.
CPU caching (L1, L2 cache, etc., and their sizes) is also an important issue. That is where the data layouts I mentioned before (row- vs. column-based) can make a difference for some operations.
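A quick way to feel this from Python (a sketch; NumPy arrays are row-major by default, and the variable names are arbitrary): summing the same data row by row touches contiguous memory, while summing it column by column strides across it, so every access lands in a different cache line.

```python
import time
import numpy as np

n = 1500
img = np.arange(n * n, dtype=np.float64).reshape(n, n)  # row-major (C order)

t0 = time.perf_counter()
s_rows = 0.0
for y in range(n):
    s_rows += img[y].sum()      # each row is contiguous in memory
row_time = time.perf_counter() - t0

t0 = time.perf_counter()
s_cols = 0.0
for x in range(n):
    s_cols += img[:, x].sum()   # elements 8*n bytes apart: cache-unfriendly
col_time = time.perf_counter() - t0

print(f"rows: {row_time:.3f}s  cols: {col_time:.3f}s")
```

Both loops compute the same total; how much slower the column pass is depends on the cache sizes, which is exactly the point.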
For image processing, unless you need floats, I would say speed goes GPU >> SIMD+multithreaded C#/JS/Go/Rust/Scala >> SIMD Matlab > SIMD Octave (?) >> non-SIMD Matlab >> Octave
(>> being a ~5x speedup or more)
Now we also have to think in terms of the totality of distributed systems, which include GPUs, etc. as subcomponents of a larger system. The world has grown beyond computation in a single, isolated environment. See below.
I do wonder if it would actually be faster to implement expensive image processing algorithms in a fast, parallel+SIMD language and expose them behind a server.
Now we are talking

This is something I've been doing for some time.
If the requests don't leave the machine, they should be served (transfer included) in a few milliseconds. Then you could do these things very quickly, with the same accessibility as Matlab/Octave but all the speed benefits of faster, "more programmy" languages.
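As a proof of concept, the whole loop fits in the Python standard library. A sketch under obvious assumptions: `InvertHandler` is a made-up name, and the byte-inverting "filter" is a toy stand-in for a real image operation that a fast native backend would perform.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class InvertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Toy "filter": invert the posted pixel bytes.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        out = bytes(255 - b for b in body)
        self.send_response(200)
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Port 0 lets the OS pick a free port; serve in a background thread.
server = ThreadingHTTPServer(("127.0.0.1", 0), InvertHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/"
pixels = bytes(range(256))
reply = urllib.request.urlopen(urllib.request.Request(url, data=pixels)).read()
server.shutdown()
server.server_close()
print(reply == bytes(255 - p for p in pixels))  # prints True
```

For same-machine requests like this, the round trip is dominated by the filter itself rather than the transport, which is what makes the server idea viable.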
A lot of secret sauce here. But you are thinking in the right direction.