Device Driver Development [Part 4]

Device Driver Development [Part 4]

By Dave Jones (
In the previous installments of this article, there have been some inaccuracies, workarounds, ugly hacks, and other code which should never have been written. Fortunately, some readers have pointed this out to me, so I can learn, as can you.

The code I’ll base this article on is similar in functionality to that shown in the second installment of this article. This has been done so that you can compare the differences between them, and hopefully learn what not to do.

The first big mistake in the earlier article, was that I was dereferencing a pointer to user space from within kernel space. This is not allowed, as the kernel will not know which segment the offset refers to, and use the current one instead. It’s remarkable that this code ever worked! The corrected code should use the copy_from_user() function instead of a loop dereferencing to user space. The TEST_write_proc subroutine now gets a lot shorter, as we do away with the memory copying loop. All we have to do, is null terminate the string, and then copy_from_user() it.

 static int TEST_write_proc( struct file *file, const char *buffer, unsigned long count, void *data) {  	if (!capable(CAP_SYS_ADMIN)) 		return -EACCES;  	/* We must have room to null-terminate the string. */ 	if (count>=(BUFFER_SIZE-1)) 		return -EINVAL;  	*(kernel_buffer_ptr+count) ='';  	copy_from_user(kernel_buffer_ptr, buffer, count);  	return(count); } 

The next mistake was quite a subtle one. In some cases, the routine to which the kernel uses to put information to userspace can overflow. The author of the original code was I believe aware of the problem, but didn’t fit any safety measures into the code. It takes the stance that “You should know what you’re feeding this routine”.
Douglas Gilbert wrote the following mail to me, which sums it up perfectly..

The way that the PRINT_PROC macro works will always cause the following to be true: *begin <= offset . Now when the data being output is less than 3072 bytes (which is the 'size' that Linux always seems to prefer (in 2.3.21)) then: 0 == *begin == offset so there is no problem. However if that amount of output is exceeded then *begin < offset > 0 (most of the time) and the brown stuff hits the wall in TEST_read_proc() due to:  *start = buffer + (begin - offset);  This leads to *start being an address _before_ buffer and ....  I watched the proc_read side with debug placed inside the kernel and it initially specifies a size of 3072 bytes while it actually allocates memory for a page (4096 bytes on i386).  Now the logic seems to be here (assuming you are using sprinf() to output to the buffer) if a later line overflows the 3072 count then it won't do it by more than 1024 bytes (otherwise it would corrupt kernel memory).  The way the logic works, the following call by the kernel to proc_read gets the driver to throw away all _complete_ lines it put in the first call (via the 'offset' argument). So it starts output with the first incomplete line followed by the previously unprocessed lines. [The 'start' argument tells proc_fs how to to glue the 2 buffers it got back from its proc_read() calls together. This is were the bug was.]  So I found the bug by having lots of data to output. Try it and watch the kernel errors. You are correct in saying this can not easily be exploited by the users. After increasing the contents array size, try: # cat text_file_over_3072_bytes_long > /proc/testdir/test # cat /proc/testdir/test  and see what happens.  There is still a problem that must be left up to the driver to fix. If this technique is used to output large amounts of rapidly changing state (e.g. debug from a stress tested device driver) then there may have been a change of driver state between the first proc_read() call that was incomplete and the follow up one(s). No kernel memory corruption results from this but the output the user sees can have some "discontinuities" in it. Fixing this would require the driver holding a large buffer or using some locks (the latter idea is not a good one). 

The corrected TEST_read_proc() then looks like this..

 static int TEST_read_proc( char *buffer, char **start, off_t offset, 			    int size, int *eof, void *data ) { 	int len = 0; 	off_t begin = 0;  	*eof = proc_infos( contents, buffer, &len, &begin, offset, size );  	if (offset >= begin + len) 		return( 0 ); 	*start = buffer + ((begin  < offset ) ? (offset - begin) : (begin - offset)); 	return( size < begin + len - offset ? size : begin + len - offset ); } 

The initialization routine also had some things wrong with it. Nothing major, but some bits that could have been a lot nicer.

  1. Use of __init as opposed to the older __initfunc
  2. Testing return codes for failure, and logging error if something goes wrong
  3. Removes the directory if file create fails.

The corrected init routine should look like this…

int __init TEST_init(void) { 	printk(KERN_INFO ["TEST driver v%s ", TEST_VERSION );  	kernel_buffer_ptr=contents;  	proc_TESTDIR = create_proc_entry( "testdir", S_IFDIR, 0 ); 	if (proc_TESTDIR==0) { 		printk(KERN_INFO "Couldn't create testdir "); 		return (-1); 	}  	if ((proc_TEST = create_proc_entry( "test", 0, proc_TESTDIR ))) { 		proc_TEST->read_proc = TEST_read_proc; 		proc_TEST->write_proc = TEST_write_proc; 	} else { 		printk(KERN_INFO "Couldn't create test "); 		remove_proc_entry ( "testdir", 0 ); 		return (-1); 	}  	return( 0 ); } 

I fell into quite a few pitfalls whilst writing this originally. The biggest problem was possibly that the code was heavily based on existing code, which I was then pushing to do things it was never designed to do. At least one of the bugs above could have been avoided if the code I had based it on had comments warning anyone who may chance upon that code at a later date that things may go pear shaped if you don’t take the precautions.

I’m indebted to Jeff Garzik, and Douglas Gilbert for showing me the shortfallings of the previous articles, and for enlightening me with some worthwhile criticism. I’m learning kernel hacking as I go along, and hopefully with enough help from people like these, I’ll learn a helluva lot faster, thanks guys.

댓글 남기기

아래 항목을 채우거나 오른쪽 아이콘 중 하나를 클릭하여 로그 인 하세요: 로고

WordPress.com의 계정을 사용하여 댓글을 남깁니다. 로그아웃 / 변경 )

Twitter 사진

Twitter의 계정을 사용하여 댓글을 남깁니다. 로그아웃 / 변경 )

Facebook 사진

Facebook의 계정을 사용하여 댓글을 남깁니다. 로그아웃 / 변경 )

Google+ photo

Google+의 계정을 사용하여 댓글을 남깁니다. 로그아웃 / 변경 )

%s에 연결하는 중

%d 블로거가 이것을 좋아합니다: