Lexicographically sorting large files in Linux

When I hear the word “sort” my first thought is usually “Hadoop”! Yes, sorting is one thing that Hadoop does well, but if you’re working with large files in Linux the built-in sort command is often all you need.

Let’s say you have a large file on a host with 2GB or more of main memory free. The following sort command is an efficient way to lexicographically order large files.

LC_COLLATE=C sort --buffer-size=1G --temporary-directory=./tmp --unique bigfile.txt

Let’s break this command down and examine each part in detail.
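
As a rough sketch of what each part does, the script below runs the same command on a tiny stand-in file. It assumes GNU sort (the long option names are GNU-specific). The sample data, the scratch directory, the `--output=sorted.txt` option, and the `LC_ALL=` prefix are additions for illustration and are not part of the original command. The `LC_ALL=` prefix is there because a non-empty `LC_ALL` in the environment would override `LC_COLLATE`.

```shell
# Work in a scratch directory with a small, made-up stand-in for bigfile.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' banana Apple cherry apple banana Cherry > bigfile.txt

# sort does not create the temporary directory itself, so make it first.
mkdir -p ./tmp

# LC_COLLATE=C                  compare lines byte by byte, so uppercase ASCII
#                               sorts before lowercase and no locale rules apply
# --buffer-size=1G              main-memory buffer size; larger inputs are split
#                               into sorted runs written to temporary files
# --temporary-directory=./tmp   where those temporary files go (default: $TMPDIR or /tmp)
# --unique                      output only the first of each run of equal lines
# --output=sorted.txt           (added here) write to a file instead of stdout
LC_ALL= LC_COLLATE=C sort --buffer-size=1G --temporary-directory=./tmp \
    --unique --output=sorted.txt bigfile.txt

cat sorted.txt
```

On the sample input this prints Apple, Cherry, apple, banana, cherry, one per line: the duplicate "banana" is dropped and the capitalized words come first because of the byte-wise C collation.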

