Differences
This shows you the differences between two versions of the page.
resarch:nlpa:paper_3 [2014/10/16 11:15] preethac |
resarch:nlpa:paper_3 [2014/10/16 13:00] (current) preethac |
||
---|---|---|---|
Line 8: | Line 8: | ||
** Problem:** | ** Problem:** | ||
- | Filtering of large unstructured data from developer emails and bug tracking systems. | + | \\ Absence of comments/descriptions for methods. |
**Importance/Applications of the technique:** | **Importance/Applications of the technique:** | ||
- | Automatic mining of source code descriptions from bug tracking systems and mailing lists. | + | \\ Automatic mining of source code descriptions from bug tracking systems and mailing lists. |
- | Source code re-documentation. | + | \\ Source code re-documentation. |
**Approach:** | **Approach:** | ||
+ | \\ 1. Downloading emails and tracing them onto classes using two heuristics: the email contains the fully qualified class name or the file name. For bug reports, matching bug IDs to closing comments. | ||
+ | \\ 2. Extracting paragraphs based on programming language keyword/operator density. | ||
+ | \\ 3. Tracing paragraphs onto methods based on the occurrence of the keyword "method" and a method name followed by a parenthesis. | ||
+ | \\ 4. Filtering paragraphs further by return types, overriding/overloading, and method invocations (invoking a method inside another method). | ||
+ | \\ 5. Computing textual similarities between paragraphs and methods and ranking them by the similarity measure. | ||
- | |||
- | ** | ||
- | problematic API features** | ||
** | ** |
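
Steps 2, 3, and 5 of the approach listed in the newer revision can be illustrated with a small sketch. It is not the paper's implementation. The keyword and operator sets, the `max_density` threshold, the choice to keep low-density paragraphs (on the assumption that they are natural language rather than pasted code), and the use of cosine similarity over term counts are all assumptions made for illustration. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed, illustrative subsets; the paper's actual keyword/operator lists are not given here.
JAVA_KEYWORDS = {"public", "private", "static", "void", "class", "return",
                 "new", "if", "else", "for", "while", "int", "import"}
OPERATOR_CHARS = set("{};=()[]+-*/<>&|!")


def code_density(paragraph):
    """Step 2: fraction of tokens that are language keywords or operator characters."""
    tokens = re.findall(r"\w+|[^\w\s]", paragraph)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in JAVA_KEYWORDS or t in OPERATOR_CHARS)
    return hits / len(tokens)


def mentions_method(paragraph, method_name):
    """Step 3: the word 'method' occurs and the method name is followed by '('."""
    has_keyword = re.search(r"\bmethod\b", paragraph, re.IGNORECASE) is not None
    has_call = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph) is not None
    return has_keyword and has_call


def tokenize(text):
    """Lowercased word tokens, with camelCase identifiers split into their parts."""
    parts = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return [p.lower() for p in parts]


def cosine_similarity(text_a, text_b):
    """Step 5: cosine similarity between term-count vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def rank_paragraphs(paragraphs, method_name, method_text, max_density=0.3):
    """Keep low-density paragraphs that mention the method, ranked by similarity."""
    candidates = [p for p in paragraphs
                  if code_density(p) <= max_density and mentions_method(p, method_name)]
    return sorted(candidates,
                  key=lambda p: cosine_similarity(p, method_text),
                  reverse=True)


if __name__ == "__main__":
    paragraphs = [
        "The method parseConfig() reads the configuration file and returns the parsed settings.",
        "public static void main(String[] args) { int x = 0; return; }",
        "The method parseConfig() is called at startup.",
        "I think the build is broken again.",
    ]
    method_text = "Settings parseConfig(File configuration) read file parse settings return"
    for p in rank_paragraphs(paragraphs, "parseConfig", method_text):
        print(round(cosine_similarity(p, method_text), 3), p)
```

Step 4 (filtering by return types, overriding/overloading, and method invocations) is left out because it needs parsed source code rather than plain text. A real pipeline would also likely weight terms (for example with tf-idf) instead of using raw counts.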