Hitech Solutions

SINCE 2004




Clustering Sentence Level Text Using Novel Fuzzy

Abstract

In this paper, we propose a novel fuzzy clustering algorithm for sentence-level text. Fuzzy clustering allows an object to belong to more than one cluster, with differing degrees of membership. This is important in domains such as sentence clustering, since a sentence is likely to be related to more than one theme or topic present within a document or set of documents. An obvious potential application of the algorithm is document summarization; however, the algorithm can also be used within more general text mining settings such as query-directed text mining. Two properties of the algorithm stand out. First, it appears to depend far less on its starting point than k-Means and Gaussian mixture approaches, which tend to be highly sensitive to initialization. Second, the algorithm appears to be able to converge to an appropriate number of clusters, even if the number of initial clusters is set very high.

Existing System

Clustering text at the document level is well established in the Information Retrieval (IR) literature, where documents are typically represented as data points in a high-dimensional vector space in which each dimension corresponds to a unique keyword. This leads to a rectangular representation in which rows represent documents and columns represent attributes of those documents (e.g., tf-idf values of the keywords). The vector space model has been successful in IR because it is able to adequately capture much of the semantic content of document-level text. Documents that are semantically related are likely to contain many words in common, and thus are found to be similar according to popular vector space measures such as cosine similarity, which are based on word co-occurrence. However, while the assumption that semantic similarity can be measured in terms of word co-occurrence may be valid at the document level, it does not hold for small text fragments such as sentences, since two sentences may be semantically related despite having few, if any, words in common.
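The limitation described above can be illustrated with a minimal sketch. The example sentences, the smoothed idf formula, and the helper names `tfidf_vectors` and `cosine` are assumptions made for illustration only; they are not taken from the paper.

```python
import math
from collections import Counter

def tfidf_vectors(texts):
    """Return one sparse tf-idf vector (dict: word -> weight) per text."""
    tokenized = [t.lower().split() for t in texts]
    n = len(tokenized)
    # Document frequency: number of texts that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf (an assumed variant) keeps every weight positive.
        vectors.append(
            {w: c * (math.log((1 + n) / (1 + df[w])) + 1.0) for w, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical sentences: 0 and 1 are related in meaning but share no words.
sentences = [
    "the physician treated the patient",
    "a doctor cured her illness",
    "the physician examined the patient",
]
vecs = tfidf_vectors(sentences)

sim_related_no_overlap = cosine(vecs[0], vecs[1])  # 0.0: no word in common
sim_shared_words = cosine(vecs[0], vecs[2])        # positive: three shared words
print(sim_related_no_overlap, sim_shared_words)
```

Because cosine similarity over keyword vectors only rewards shared words, the first pair scores zero even though both sentences describe medical treatment. This is the gap that a sentence-level similarity measure has to close.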

Proposed System

This paper presents a novel fuzzy clustering algorithm that operates on relational input data; i.e., data in the form of a square matrix of pairwise similarities between data objects. The algorithm uses a graph representation of the data, and operates in an Expectation-Maximization framework in which the graph centrality of an object in the graph is interpreted as a likelihood. Results of applying the algorithm to sentence clustering tasks demonstrate that the algorithm is capable of identifying overlapping clusters of semantically related sentences, and that it is therefore of potential use in a variety of text mining tasks.
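The description above can be sketched roughly as follows. This is not the authors' implementation: the source says only that graph centrality is interpreted as a likelihood within an Expectation-Maximization framework. The use of PageRank as the centrality measure, the damping factor of 0.85, the membership-weighted edges, and the names `pagerank` and `fuzzy_relational_cluster` are all assumptions made for illustration.

```python
import random

def pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank on a non-negative weight matrix (assumed centrality)."""
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            new.append((1 - damping) / n + damping * incoming)
        total = sum(new)
        scores = [x / total for x in new]  # keep the scores summing to 1
    return scores

def fuzzy_relational_cluster(sim, k, iterations=30, seed=0):
    """Return (memberships, mixing) for a square pairwise-similarity matrix."""
    rng = random.Random(seed)
    n = len(sim)
    # Random initial memberships; each row is normalized to sum to 1.
    member = []
    for _ in range(n):
        row = [rng.random() + 1e-6 for _ in range(k)]
        s = sum(row)
        member.append([x / s for x in row])
    mixing = [1.0 / k] * k
    for _ in range(iterations):
        # Centrality of every object within each cluster, read as a likelihood.
        like = []
        for m in range(k):
            w = [
                [sim[i][j] * member[i][m] * member[j][m] if i != j else 0.0
                 for j in range(n)]
                for i in range(n)
            ]
            like.append(pagerank(w))
        # E-step: new memberships are proportional to mixing * likelihood.
        for i in range(n):
            row = [mixing[m] * like[m][i] for m in range(k)]
            s = sum(row)
            member[i] = [x / s for x in row]
        # M-step: each mixing coefficient is the mean membership of its cluster.
        mixing = [sum(member[i][m] for i in range(n)) / n for m in range(k)]
    return member, mixing

# Hypothetical relational input: two blocks of three mutually similar objects.
n_objects = 6
sim = [
    [1.0 if i == j else (0.9 if i // 3 == j // 3 else 0.05) for j in range(n_objects)]
    for i in range(n_objects)
]
member, mixing = fuzzy_relational_cluster(sim, k=2)
for row in member:
    print([round(x, 3) for x in row])
```

Each row of `member` holds one object's degrees of membership across the clusters and sums to 1, which is how a sentence can belong to more than one cluster at once. The input is only the square similarity matrix; no vector representation of the objects is needed.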

