<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-5350320546754695211.post2232358627408458586..comments</id><updated>2012-01-15T10:08:44.771-08:00</updated><title type='text'>Comments on Daniel's Software Blog: Fast memcpy in c</title><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.danielvik.com/feeds/2232358627408458586/comments/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html'/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>18</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-5012300692897152109</id><published>2012-01-15T10:08:44.771-08:00</published><updated>2012-01-15T10:08:44.771-08:00</updated><title type='text'>Thanks for finding the bug. I&amp;#39;ve updated the z...</title><content type='html'>Thanks for finding the bug. 
I&amp;#39;ve updated the zip to reflect the change.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/5012300692897152109'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/5012300692897152109'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1326650924771#c5012300692897152109' title=''/><link rel='related' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7815089520864602491'/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-7815089520864602491</id><published>2011-12-28T13:50:17.058-08:00</published><updated>2011-12-28T13:50:17.058-08:00</updated><title type='text'>There is a (mostly benign) bug in the code.

#if T...</title><content type='html'>There is a (mostly benign) bug in the code.&lt;br /&gt;&lt;br /&gt;#if TYPE_WIDTH &amp;gt;= 4&lt;br /&gt;&lt;br /&gt;The code in the #if block will always be included. That should be &amp;quot;&amp;gt;&amp;quot;, not &amp;quot;&amp;gt;=&amp;quot;.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7815089520864602491'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7815089520864602491'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1325109017058#c7815089520864602491' title=''/><author><name>id</name><uri>https://www.google.com/accounts/o8/id?id=AItOawn6nafYVyJ4OxqMQVOLZ5PK49FzEyrJTDw</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/openid16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-727902580'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-935626540610529788</id><published>2011-01-28T12:20:47.070-08:00</published><updated>2011-01-28T12:20:47.070-08:00</updated><title type='text'>I talked with my C standard guru friends and the b...</title><content type='html'>I talked with my C standard guru friends and the behavior is indeed undefined, but most architectures and compilers do behave as the snippet intends. 
&lt;br /&gt;&lt;br /&gt;From a portability point of view, it&amp;#39;s not a concern though. The full memcpy implementation contains three different implementations:&lt;br /&gt;&lt;br /&gt;1. Post increment&lt;br /&gt;2. Pre increment&lt;br /&gt;3. Indexed copy&lt;br /&gt;&lt;br /&gt;The undefined behavior only applies to option 2, and the others should be fine to use on architectures that don&amp;#39;t have the more common implementation of the undefined behavior.&lt;br /&gt;&lt;br /&gt;Thanks for pointing it out.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/935626540610529788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/935626540610529788'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1296246047070#c935626540610529788' title=''/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-6849006998101584423</id><published>2011-01-27T20:34:28.912-08:00</published><updated>2011-01-27T20:34:28.912-08:00</updated><title 
type='text'>No, it does not state that Q-1==P when Q is outsid...</title><content type='html'>No, it does not state that Q-1==P when Q is outside the boundary of the array. It states: &amp;quot;If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.&amp;quot;&lt;br /&gt;&lt;br /&gt;There is a provision for t+n, but none for t-1 (t being an array of size n).&lt;br /&gt;&lt;br /&gt;I gave the number of the paragraph earlier. Look it up: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6849006998101584423'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6849006998101584423'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1296189268912#c6849006998101584423' title=''/><author><name>Pascal</name><uri>http://www.blogger.com/profile/16863430576950446222</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-1175727680'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-7284309355010200864</id><published>2011-01-27T12:31:26.351-08:00</published><updated>2011-01-27T12:31:26.351-08:00</updated><title type='text'>Adding an integer to a pointer does I believe not ...</title><content type='html'>Adding an integer to a pointer does I believe not cause undefined behavior if there are no overflows in the operation. The rule states that if Q=P+1, then Q-1==P, also if Q is outside the boundary of the array.&lt;br /&gt;However an indirection of Q (e.g. v = *Q) will cause memory access violation if the memory is not accessible.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7284309355010200864'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7284309355010200864'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1296160286351#c7284309355010200864' title=''/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-3638122458580232762</id><published>2011-01-27T10:26:48.474-08:00</published><updated>2011-01-27T10:26:48.474-08:00</updated><title type='text'>Technically, --src8; and --dst8; invoke undefined ...</title><content type='html'>Technically, --src8; and --dst8; invoke undefined behavior when src and dest point to the beginning of their respective blocks. See 6.5.6.8 in the C99 standard. And indeed some segmented architectures will trap on these instructions.&lt;br /&gt;&lt;br /&gt;So it&amp;#39;s not portable in the sense that it works when compiled by compliant compilers on these architectures.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/3638122458580232762'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/3638122458580232762'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1296152808474#c3638122458580232762' title=''/><author><name>Pascal</name><uri>http://www.blogger.com/profile/16863430576950446222</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-1175727680'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-3600160023853417739</id><published>2011-01-12T09:37:27.141-08:00</published><updated>2011-01-12T09:37:27.141-08:00</updated><title type='text'>Indeed. Typically memcpy doesn&amp;#39;t. The standard...</title><content type='html'>Indeed. Typically memcpy doesn&amp;#39;t. The standard function that does is memmove. It&amp;#39;s probably easy to add a test to check whether the overlap causes problems with the algorithm. Since the algorithm uses 32/64-bit blocks, the special cases are dependent on alignment, but they shouldn&amp;#39;t be hard to add in order to get the characteristics of memmove.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/3600160023853417739'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/3600160023853417739'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1294853847141#c3600160023853417739' title=''/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-2918118460691398382</id><published>2011-01-10T21:38:18.718-08:00</published><updated>2011-01-10T21:38:18.718-08:00</updated><title type='text'>Your code does not account for cases when the memo...</title><content type='html'>Your code does not account for cases when the memory region pointed to by the source overlaps the memory region pointed to by the destination.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/2918118460691398382'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/2918118460691398382'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1294724298718#c2918118460691398382' title=''/><author><name>ftwilliam</name><uri>http://www.blogger.com/profile/09781226946873418738</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-1151967191'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-5985215114489888936</id><published>2010-12-29T18:18:31.112-08:00</published><updated>2010-12-29T18:18:31.112-08:00</updated><title type='text'>It&amp;#39;s my understanding that things like the glibc an...</title><content type='html'>It&amp;#39;s my understanding 
that things like the glibc and related libs  are not using any worthwhile SIMD optimised instructions  at this time for several routines, perhaps you should investigate and try some options.&lt;br /&gt;&lt;br /&gt;see:&lt;br /&gt;http://www.freevec.org/content/commentsconclusions&lt;br /&gt;&lt;br /&gt;&amp;quot;... Finally, with regard to glibc performance, even if we take into account that some common routines are optimised (like strlen(), memcpy(), memcmp() plus some more), most string functions are NOT optimised. Not only that, glibc only includes reference implementations that perform the operations one-byte-at-a-time! How&amp;#39;s that for inefficient? We&amp;#39;re not talking about dummy unused joke functions here like memfrob(), but really important string and memory functions that are used pretty much everywhere, like strcmp(), strncmp(), strncpy(), etc.&lt;br /&gt;In times where power consumption has become so much important, I would think that the first thing to do to save power is optimise the software, and what better place to start than the core parts of an operating system? 
I can&amp;#39;t speak for the kernel -though I&amp;#39;m sure it&amp;#39;s very optimised actually- but having looked at the glibc code extensively the past years, I can say that it&amp;#39;s grossly unoptimised, so much it hurts.&amp;quot;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/5985215114489888936'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/5985215114489888936'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1293675511112#c5985215114489888936' title=''/><author><name>pip99</name><uri>http://pip99.livejournal.com/</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/openid16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-622099489'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-407124041446399791</id><published>2010-12-11T21:17:30.380-08:00</published><updated>2010-12-11T21:17:30.380-08:00</updated><title type='text'>I am a beginner who has not quite understand with ...</title><content type='html'>I am a beginner who has not quite understand with this. 
just greetings from tyang @ &lt;a href="http://codeprogram.blogspot.com" rel="nofollow"&gt;codeprogram.blogspot.com&lt;/a&gt; just a beginner programer&amp;#39;s blog</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/407124041446399791'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/407124041446399791'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1292131050380#c407124041446399791' title=''/><author><name>tyang</name><uri>http://www.blogger.com/profile/07312418170280606544</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='06544866633445357372'/><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='33' height='26' src='http://1.bp.blogspot.com/_CU-p3-k5y5A/TPjDMJR-rbI/AAAAAAAAAAM/R2CIsw-yLuY/S220/pic.php.jpeg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-117543958'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-7935027275490338282</id><published>2010-12-03T18:00:28.623-08:00</published><updated>2010-12-03T18:00:28.623-08:00</updated><title type='text'>Sorry I don&amp;#39;t understand . Isn&amp;#39;t it true t...</title><content type='html'>Sorry I don&amp;#39;t understand . 
Isn&amp;#39;t it true that &amp;quot;memcmp&amp;quot; and &amp;quot;memcpy&amp;quot; are implemented using only one x86 instruction? (Only one x86 machine-language instruction is executed to perform the whole copy in one shot.)&lt;br /&gt;In that case, how can these functions be optimized by any algorithm using for loops?</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7935027275490338282'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/7935027275490338282'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1291428028623#c7935027275490338282' title=''/><author><name>Peyman</name><uri>http://www.blogger.com/profile/03201543629947522031</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-311774560'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-6931641601737625377</id><published>2010-08-13T22:25:12.384-07:00</published><updated>2010-08-13T22:25:12.384-07:00</updated><title type='text'>It&amp;#39;s probably more common for data to be aligned...</title><content type='html'>It&amp;#39;s probably more common for data to be aligned, and nowadays built-in memcpy implementations are usually faster for these cases. 
If you have a mix of unaligned and aligned copies and speed is important, you can always use the built-in memcpy for aligned copies and this one for unaligned. Otherwise I would stick with the built-in. Processors nowadays are so fast that time spent in memcpy isn&amp;#39;t a big issue, but there are cases when it matters, and then it&amp;#39;s good to have options...</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6931641601737625377'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6931641601737625377'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1281763512384#c6931641601737625377' title=''/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-5047755800880506373</id><published>2010-08-12T18:40:26.274-07:00</published><updated>2010-08-12T18:40:26.274-07:00</updated><title type='text'>I&amp;#39;ve been testing your memcpy() on an embedded...</title><content type='html'>I&amp;#39;ve been testing your memcpy() on an embedded system, Arm7 
under the iar compiler.&lt;br /&gt;&lt;br /&gt;I tried several configurations of the memcpy(), with or without indexes and pre incremented pointers and I&amp;#39;m only getting about 65% of the performance of the compiler built-in version. The performance of copying aligned to unaligned or unaligned to aligned buffers is much quicker with your memcpy() however, about 2x faster.&lt;br /&gt;&lt;br /&gt;Chris</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/5047755800880506373'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/5047755800880506373'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1281663626274#c5047755800880506373' title=''/><author><name>cmorgan</name><uri>http://www.blogger.com/profile/00505774502464208534</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-136213656'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-4108893419246880666</id><published>2010-08-12T18:29:59.408-07:00</published><updated>2010-08-12T18:29:59.408-07:00</updated><title type='text'>Just wanted to let you know that I&amp;#39;m seeing co...</title><content type='html'>Just wanted to let you know that I&amp;#39;m seeing compiler 
warnings about unreachable statements from the &amp;#39;break;&amp;#39; statements after the COPY_XXX() macros because the macro is returning.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/4108893419246880666'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/4108893419246880666'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1281662999408#c4108893419246880666' title=''/><author><name>cmorgan</name><uri>http://www.blogger.com/profile/00505774502464208534</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-136213656'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-6832607874336299320</id><published>2010-07-05T21:49:19.693-07:00</published><updated>2010-07-05T21:49:19.693-07:00</updated><title type='text'>I think you are using it correctly. Not sure what ...</title><content type='html'>I think you are using it correctly. Not sure what optimization you turned on, but I don&amp;#39;t think you&amp;#39;ll beat the standard memcpy when copying aligned data on a standard processor. 
Nowadays, most standard library memcpy&amp;#39;s are pretty good, especially on established processors.&lt;br /&gt;&lt;br /&gt;This implementation is mainly beneficial if you are running a new embedded processor and sometimes on other systems (like initial versions of clib for PSP).&lt;br /&gt;&lt;br /&gt;Occasionally this implementation works better than standard implementations on unaligned data, even on more established processors, but I wouldn&amp;#39;t bother testing it on e.g. x86-64 unless speed is really a big problem.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6832607874336299320'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6832607874336299320'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1278391759693#c6832607874336299320' title=''/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-2893910924899822</id><published>2010-06-16T17:30:08.694-07:00</published><updated>2010-06-16T17:30:08.694-07:00</updated><title type='text'>I tried this as-is in a c++ project compiled with ...</title><content type='html'>I tried this as-is in a c++ project compiled with Xcode for x86_64 on a Mac Mini with the following code:&lt;br /&gt;&lt;br /&gt;  int copy1[50];&lt;br /&gt;  int copy2[50];&lt;br /&gt;  for (int i = 0; i &amp;lt; 500000000; ++i)&lt;br /&gt;  {&lt;br /&gt;    memcpy(copy1, copy2, 50 * sizeof(int));&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;The standard memcpy took about 7 seconds, the new memcpy took about 14 seconds.  Am I using it in the wrong environment, or with the wrong data set to take advantage of its speed?</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/2893910924899822'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/2893910924899822'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1276734608694#c2893910924899822' title=''/><author><name>PaulLuigi</name><uri>http://www.blogger.com/profile/12114481298557791021</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty 
xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-522794876'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-6677436824097603610</id><published>2010-04-23T23:12:00.946-07:00</published><updated>2010-04-23T23:12:00.946-07:00</updated><title type='text'>I actually used early on, but Duff&amp;#39;s device on...</title><content type='html'>I actually used it early on, but Duff&amp;#39;s device only works well with pre- and post-increment. Most targets seem to work best with an indexed copy, and then you can&amp;#39;t use Duff&amp;#39;s device as presented on the wiki. It is easy, though, to replace the while (length &amp;amp; 7) in COPY_SHIFT with a switch statement, but IIRC it didn&amp;#39;t really give a performance boost. Such a switch statement isn&amp;#39;t a Duff&amp;#39;s device, since it doesn&amp;#39;t have the while loop, but the idea is similar.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6677436824097603610'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/6677436824097603610'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1272089520946#c6677436824097603610' title=''/><author><name>Daniel Vik</name><uri>http://www.blogger.com/profile/13059236177797348097</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_T9_5SMKEUXk/S3WNGIMojjI/AAAAAAAAAAM/MbmYniFDOv8/S220/danielvik.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' 
source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-719728608'/></entry><entry><id>tag:blogger.com,1999:blog-5350320546754695211.post-9014836145389116785</id><published>2010-03-20T01:21:21.536-07:00</published><updated>2010-03-20T01:21:21.536-07:00</updated><title type='text'>I&amp;#39;m amazed that you haven&amp;#39;t mentioned Duff...</title><content type='html'>I&amp;#39;m amazed that you haven&amp;#39;t mentioned Duff&amp;#39;s device anywhere. Am I missing something? http://en.wikipedia.org/wiki/Duff%27s_device</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/9014836145389116785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5350320546754695211/2232358627408458586/comments/default/9014836145389116785'/><link rel='alternate' type='text/html' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html?showComment=1269073281536#c9014836145389116785' title=''/><author><name>Mike Axiak</name><uri>http://www.blogger.com/profile/00191075452773889438</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.danielvik.com/2010/02/fast-memcpy-in-c.html' ref='tag:blogger.com,1999:blog-5350320546754695211.post-2232358627408458586' source='http://www.blogger.com/feeds/5350320546754695211/posts/default/2232358627408458586' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-246681687'/></entry></feed>
