Re: Can I set a larger HDFS block size, like 4 or 8 GB, in a production environment? What is the problem with large blocks?
In theory, HDFS should support block sizes that large. In practice, large blocks can cause the following problems:
1. It may take a long time to replicate a block if any of its replicas is lost or moved (by the balancer, the mover, or re-replication).
2. During pipeline recovery on write/append, if a failed node is replaced by a new one, the data already written is copied to the new datanode. This may take a long time depending on how much has been written (GBs in this case).
If this transfer does not complete within the timeout (default 60s), the client may time out and the write may fail.
3. The balancer may time out while moving blocks from one datanode to another, because each block is so large.
(Remark: this reply is from HDFS PMC member Vinayakumar.)
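
To get a rough feel for the timeout concern in points 2 and 3, here is a minimal back-of-the-envelope sketch in Python. It is not part of the reply above. The 60-second figure is the default timeout mentioned in point 2; the 100 MB/s throughput is purely an assumed, hypothetical number, and real transfer rates depend on your network, disks, and cluster load. The model also simplifies by treating the copy as one continuous transfer of a full block.

```python
MB = 1024 ** 2
GB = 1024 ** 3

TIMEOUT_SECONDS = 60                      # default timeout cited in the reply
ASSUMED_THROUGHPUT_BYTES_PER_SEC = 100 * MB  # hypothetical; measure your own cluster


def transfer_seconds(block_bytes, throughput_bytes_per_sec):
    """Estimate how long copying one full block takes at a steady throughput."""
    return block_bytes / throughput_bytes_per_sec


def exceeds_timeout(block_bytes, throughput_bytes_per_sec, timeout_seconds=TIMEOUT_SECONDS):
    """Return True if the estimated copy time is longer than the timeout."""
    return transfer_seconds(block_bytes, throughput_bytes_per_sec) > timeout_seconds


if __name__ == "__main__":
    for label, size in (("128 MB", 128 * MB), ("4 GB", 4 * GB), ("8 GB", 8 * GB)):
        seconds = transfer_seconds(size, ASSUMED_THROUGHPUT_BYTES_PER_SEC)
        verdict = "exceeds" if exceeds_timeout(size, ASSUMED_THROUGHPUT_BYTES_PER_SEC) else "fits within"
        print(f"{label}: ~{seconds:.1f}s, {verdict} the {TIMEOUT_SECONDS}s timeout")
```

Under this assumed throughput, a nearly full 8 GB block would take longer than 60 seconds to copy, while a 128 MB block would take only a second or two. Plugging in a throughput measured on your own cluster gives a more meaningful estimate.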