Retail Management

Retail management is the process of promoting greater sales and customer satisfaction by gaining a better understanding of the consumers of a company's goods and services. A typical retail management strategy for a manufacturing business might examine the retail process that distributes its finished products to consumers, in order to determine and satisfy what buyers want and require.

Intrusion Detection System Research Based on Data Mining for IPv6

IPv6 will inevitably take the place of IPv4 as the next generation of the Internet Protocol. Although IPv6 offers better security than IPv4, some security issues remain, so there is an urgent need for intrusion detection systems (IDSs) designed for IPv6 networks.
Many intelligent information processing methods, including data mining techniques, have been applied to improve detection accuracy in IPv4 networks. This project first analyzes IPv6 security issues, then discusses an intrusion detection model for IPv6, presents a realization of that model for IPv6 networks, and proposes a strategy for implementing and optimizing the system. The resulting system performs intrusion detection effectively in IPv6 networks.

Network Intrusion Detection:            
Modern computer networks must be equipped with appropriate security mechanisms to protect the information resources they maintain. Intrusion detection systems (IDSs) are integral parts of any well-configured and well-managed computer network. An IDS is a combination of software and hardware components capable of monitoring different activities in a network and analyzing them for signs of security threats. There are two major approaches to intrusion detection: anomaly detection and misuse detection. Misuse detection uses patterns of well-known intrusions to match and identify unlabeled data; in fact, many commercial and open-source intrusion detection systems are misuse based. Anomaly detection, on the other hand, consists of building models from normal data that can then be used to detect deviations of the observed data from the normal model. The advantage of anomaly detection algorithms is that they can detect new forms of attacks that deviate from normal behaviour. In this project, various supervised learning algorithms, particularly decision trees based on ID3 and J48 and the Naïve Bayes algorithm, are explored for network intrusion detection.

Intrusion detection is the art of detecting break-ins by malicious attackers. Computer security has grown in importance with the widespread use of the Internet. Firewalls are commonly used to prevent attacks from occurring; antivirus and anti-spyware programs can help remove automated attacks that are already present; and access control limits physical and networked use of a computer. However, an important component of setting up a secure system is having some way to analyze the activity on the computer and determine whether an attack has been launched against it. Such a system is called an intrusion detection system. This project uses Naïve Bayes and decision tree algorithms to determine the relative strengths and weaknesses of these approaches. The purpose is to evaluate the performance of these algorithms so that anyone wishing to use one of them can understand how accurate it is and under what conditions it works well. In addition, a novel evaluation technique is considered: accuracy can be evaluated effectively using Receiver Operating Characteristic (ROC) curves, and cost curves can indicate the conditions under which an algorithm works well.
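To make the comparison concrete, the following is a minimal sketch (not the project's actual code) of how a decision tree and a Naïve Bayes classifier could be trained on labeled connection records and compared with ROC analysis using scikit-learn. The dataset file and column names are illustrative assumptions, and scikit-learn's DecisionTreeClassifier with the entropy criterion is used only as a stand-in for ID3/J48-style trees.

# Minimal sketch: comparing a decision tree and Naive Bayes on labeled
# network-connection records, evaluated with ROC analysis.
# Assumptions: a CSV of numeric connection features with a binary "label"
# column (0 = normal, 1 = attack); file name and columns are illustrative.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier      # entropy criterion ~ ID3/C4.5-style splits
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_curve, roc_auc_score

data = pd.read_csv("connections.csv")                # hypothetical dataset
X = data.drop(columns=["label"])
y = data["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "decision_tree": DecisionTreeClassifier(criterion="entropy", max_depth=10),
    "naive_bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]       # probability of the "attack" class
    fpr, tpr, _ = roc_curve(y_test, scores)
    print(f"{name}: AUC = {roc_auc_score(y_test, scores):.3f}")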
A requirement is a feature that the system must have or a constraint that it must satisfy to be accepted by the client. Requirements engineering aims at defining the requirements of the system under construction. It includes two main activities: requirements elicitation and analysis.
Requirements elicitation is about communication among developers, clients, and users to define a new system. It focuses on describing the purpose of the system; such a definition is called a system specification. Requirements elicitation is the more challenging of the two activities because it requires the collaboration of several groups of participants with different backgrounds. On the one hand, the client and the users are experts in their domain and have a general idea of what the system should do, but they often have little experience in software development. On the other hand, the developers have experience in building systems but often have little knowledge of the everyday environment of the users.

Intrusion Detection and Attack Classification Using Feed-Forward Neural Network

The rapid development and expansion of the World Wide Web and local network systems have changed the computing world in the last decade. This highly connected computing world has also equipped intruders and hackers with new facilities for their destructive purposes. The costs of temporary or permanent damage caused by unauthorized access to computer systems have urged different organizations to increasingly implement systems that monitor data flow in their networks. These systems are generally referred to as Intrusion Detection Systems (IDSs).
Network security is becoming an issue of paramount importance in the information technology era. A survey conducted in Australia reveals that while 98% of organizations experienced some form of broader computer crime or abuse, 67% suffered a computer security incident. National and international infrastructure is heavily network based across all sectors. As we increasingly rely on information infrastructures to support critical operations in defense, banking, telecommunication, transportation, electric power, e-governance, and many other systems, intrusions into information systems have become a significant threat to our society, with potentially severe consequences.
An intrusion compromises the security (e.g., availability, integrity, and confidentiality) of an information system through various means. Computer systems have become so large and complex, and have assumed so many important tasks, that when things go wrong it is extremely difficult to implement fixes fast enough to avoid mission-critical problems. Fast-growing data transfer rates, the proliferation of networks, and the Internet's unpredictability have added even more problems. Researchers are working to develop more efficient, reliable, and self-monitoring systems that detect problems and continue to operate, applying fixes without human interaction. This type of approach tries to reduce catastrophic failures of sensitive systems.
There are two main approaches to the design of IDSs. In a misuse detection based IDS, intrusions are detected by looking for activities that correspond to known signatures of intrusions or vulnerabilities. An anomaly detection based IDS, on the other hand, detects intrusions by searching for abnormal network traffic. The abnormal traffic pattern can be defined either as a violation of accepted thresholds for the frequency of events in a connection or as a user's violation of the legitimate profile developed for his or her normal behavior. One of the most commonly used approaches in expert-system-based intrusion detection is rule-based analysis using a profile model. Rule-based analysis relies on sets of predefined rules that are provided by an administrator or created by the system. Unfortunately, expert systems require frequent updates to remain current. This design approach usually results in an inflexible detection system that is unable to detect an attack if the sequence of events differs even slightly from the predefined profile. The problem may lie in the fact that the intruder is an intelligent and flexible agent, while rule-based IDSs obey fixed rules. This problem can be tackled by applying soft computing techniques to IDSs. Soft computing is a general term describing a set of optimization and processing techniques that are tolerant of imprecision and uncertainty. The principal constituents of soft computing are Fuzzy Logic (FL), Artificial Neural Networks (ANNs), Probabilistic Reasoning (PR), and Genetic Algorithms (GAs).
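As an illustration of the ANN constituent mentioned above, here is a minimal, hedged sketch of a feed-forward network for attack classification using scikit-learn's MLPClassifier. The synthetic data, the 41-feature layout (reminiscent of KDD-style connection records), the two-hidden-layer architecture, and the five classes are assumptions for demonstration only, not the project's actual configuration.

# Minimal sketch: a feed-forward neural network for attack classification.
# Assumptions: numeric feature matrix X and integer class labels y
# (e.g., 0 = normal, 1-4 = attack categories); sizes are illustrative.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 41))        # 41 features, as in KDD-style connection records
y = rng.integers(0, 5, size=1000)      # 5 classes: normal + 4 attack categories

# Feature scaling matters for gradient-based training of the network.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32),   # two hidden layers
                  activation="relu",
                  max_iter=300,
                  random_state=0),
)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))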

There are two general approaches by which intrusion detection technologies attempt to identify attacks: anomaly detection and misuse detection. Anomaly detection identifies activities that vary from established patterns for users or groups of users; it typically involves the creation of knowledge bases that contain the profiles of the monitored activities. The second general approach is misuse detection, which compares a user's activities with the known behaviors of attackers attempting to penetrate a system. While anomaly detection typically utilizes threshold monitoring to indicate when a certain established metric has been reached, misuse detection techniques frequently utilize a rule-based approach. When applied to misuse detection, the rules become scenarios for network attacks, and the detection mechanism identifies a potential attack if a user's activities are found to be consistent with the established rules. The use of comprehensive rules is critical in the application of expert systems for intrusion detection.
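The following is a small, hypothetical sketch of how such rule-based misuse detection can be expressed in code: each rule is a predicate over an event record, and an event that satisfies a rule is flagged with the corresponding attack scenario. The event fields and rules are invented for illustration; real IDS signatures are far richer.

# Minimal sketch: rule-based misuse detection over simple event records.
# Rules and event fields are hypothetical; real IDS signatures are far richer.
from dataclasses import dataclass

@dataclass
class Event:
    src_ip: str
    dst_port: int
    failed_logins: int

# Each rule is a predicate plus a label for the attack scenario it describes.
RULES = [
    (lambda e: e.failed_logins >= 5, "possible brute-force login"),
    (lambda e: e.dst_port == 23, "telnet access attempt"),
]

def match_rules(event: Event) -> list[str]:
    """Return the names of all rules the event is consistent with."""
    return [name for pred, name in RULES if pred(event)]

if __name__ == "__main__":
    e = Event(src_ip="10.0.0.7", dst_port=23, failed_logins=6)
    print(match_rules(e))   # ['possible brute-force login', 'telnet access attempt']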
There are four major categories of networking attacks commonly used in the intrusion detection literature: denial of service (DoS), probing, user-to-root (U2R), and remote-to-local (R2L). Every attack on a network can be placed into one of these groupings.

Data Mining

The Web is a huge, explosive, diverse, dynamic, and mostly unstructured data repository. It supplies an incredible amount of information and also raises the complexity of dealing with that information from the different perspectives of users, Web service providers, and business analysts. Users want effective search tools to find relevant information easily and precisely. Web service providers want ways to predict users' behaviors and personalize information, in order to reduce the traffic load and design Websites suited to different groups of users. Business analysts want tools to learn users' and consumers' needs. All of them expect tools or techniques that help them satisfy their demands and solve the problems encountered on the Web. Web mining has therefore become a popular and active area and is taken as the research topic for this investigation. Web usage mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Usage data captures the identity or origin of Web users along with their browsing behavior at a Web site.
Web usage mining itself can be classified further depending on the kind of usage data considered: Web server data, application server data, and application-level data. Web server data correspond to the user logs collected at the Web server. Typical data collected at a Web server include the IP addresses, page references, and access times of users, and these are the main input to the present research. This research work concentrates on Web usage mining and, in particular, focuses on discovering the usage patterns of Websites from server log files.
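As a concrete starting point, the sketch below shows how the fields named above (IP address, page reference, and access time) could be extracted from server log lines, assuming the NCSA common log format; the regular expression would need adjusting for other server configurations, and the sample line is invented for illustration.

# Minimal sketch: extracting IP address, requested page, and access time
# from web-server log lines. Assumes the NCSA common log format.
import re
from collections import Counter

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<page>\S+) [^"]*"'
)

def parse_line(line: str):
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

sample = '192.168.1.5 - - [10/Oct/2023:13:55:36 +0000] "GET /products.html HTTP/1.1" 200 2326'
record = parse_line(sample)
print(record["ip"], record["page"], record["time"])

# Aggregate page popularity across many lines -- a first step toward usage patterns.
page_counts = Counter()
for line in [sample]:                    # in practice, iterate over the log file
    rec = parse_line(line)
    if rec:
        page_counts[rec["page"]] += 1
print(page_counts.most_common(3))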

ANGEL: Enhancing the Utility of Generalization for Privacy Preserving Publication

Generalization is a well-known method for privacy-preserving data publication. Despite its vast popularity, it has several drawbacks, such as heavy information loss and difficulty in supporting marginal publication. To overcome these drawbacks, we develop ANGEL, a new anonymization technique that is as effective as generalization in privacy protection but is able to retain significantly more information in the microdata. ANGEL is applicable to any monotonic privacy principle (e.g., l-diversity, t-closeness), and its superiority in correlation preservation is especially obvious when tight privacy control must be enforced. We show that ANGEL lends itself elegantly to the hard problem of marginal publication. In particular, unlike generalization, which can release only restricted marginals, our technique can easily be used to publish any marginals with strong privacy guarantees.
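For readers unfamiliar with the privacy principles named above, the following small sketch checks distinct l-diversity for groups of records; it illustrates the principle only and is not an implementation of ANGEL itself. The record fields and the threshold are illustrative assumptions.

# Minimal sketch: checking distinct l-diversity for grouped records.
# Each group (e.g., a bucket of the anonymized table) must contain at least
# l distinct sensitive values.
from collections import defaultdict

def satisfies_l_diversity(records, group_key, sensitive_key, l=3):
    groups = defaultdict(set)
    for r in records:
        groups[r[group_key]].add(r[sensitive_key])
    return all(len(values) >= l for values in groups.values())

records = [
    {"bucket": 1, "disease": "flu"},
    {"bucket": 1, "disease": "asthma"},
    {"bucket": 1, "disease": "diabetes"},
    {"bucket": 2, "disease": "flu"},
    {"bucket": 2, "disease": "flu"},
    {"bucket": 2, "disease": "asthma"},
]
print(satisfies_l_diversity(records, "bucket", "disease", l=3))  # False: bucket 2 has only 2 values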

Congestion-Aware Routing

Sensor network deployments may include hundreds or thousands of nodes. Since deploying such large-scale networks has a high cost, it is increasingly likely that sensors will be shared by multiple applications and gather various types of data: temperature, the presence of lethal chemical gases, audio and/or video feeds, etc. Therefore, data generated in a sensor network may not all be equally important. With large deployment sizes, congestion becomes an important problem. Congestion may lead to indiscriminate dropping of data (i.e., high-priority (HP) packets may be dropped while low-priority (LP) packets are delivered). It also results in an increase in energy consumption to route packets that will be dropped downstream as links become saturated. As nodes along optimal routes are depleted of energy, only nonoptimal routes remain, further compounding the problem. To ensure that data with higher priority is received in the presence of congestion due to LP packets, differentiated service must be provided.

In this work, we are interested in congestion that results from excessive competition for the wireless medium. Existing schemes detect congestion while considering all data to be equally important. We characterize congestion as the degradation of service to HP data due to competing LP traffic. In this case, congestion detection is reduced to identifying competition for medium access between HP and LP traffic. Congestion becomes worse when a particular area generates data at a high rate. This may occur in deployments in which sensors in one area of interest are requested to gather and transmit data at a higher rate than others (similar to bursty convergecast [25]). In this case, routing dynamics can lead to congestion on specific paths. These paths are usually close to each other, which leads to an entire zone in the network facing congestion. We refer to this zone, essentially an extended hotspot, as the congestion zone (Conzone).

In this paper, we examine data delivery issues in the presence of congestion. We propose the use of data prioritization and a differentiated routing protocol and/or a prioritized medium access scheme to mitigate its effects on HP traffic. We strive for a solution that accommodates both LP and HP traffic when the network is static or near static and enables fast recovery of LP traffic in networks with mobile HP data sources. Our solution uses a differentiated routing approach to effectively separate HP traffic from LP traffic in the sensor network. HP traffic has exclusive use of nodes along its shortest path to the sink, whereas LP traffic is routed over uncongested nodes in the network but may traverse longer paths. Our contributions in this work are as follows:

Design of Congestion-Aware Routing (CAR):
CAR is a network-layer solution that provides differentiated service in congested sensor networks. CAR also prevents severe degradation of service to LP data by utilizing uncongested parts of the network.


Modules:
1. Network Formation
2. Conzone Discovery
3. Routing Data via Differentiated Paths
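The sketch below illustrates, under simplifying assumptions, the differentiated-routing idea behind these modules: HP packets take the shortest path to the sink, while LP packets are routed around nodes in the discovered Conzone, possibly over longer paths. The topology, node names, and BFS routing are hypothetical stand-ins, not the actual CAR protocol.

# Minimal sketch of CAR-style differentiated routing: HP traffic uses the
# shortest path to the sink; LP traffic avoids Conzone nodes even if that
# means a longer route. Topology and node names are hypothetical.
from collections import deque

def shortest_path(graph, src, dst, blocked=frozenset()):
    """BFS shortest path that avoids 'blocked' nodes (the Conzone for LP traffic)."""
    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited and nxt not in blocked:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

graph = {                       # adjacency list of a small sensor field
    "A": ["B", "D"], "B": ["A", "C", "E"], "C": ["B", "sink"],
    "D": ["A", "E"], "E": ["B", "D", "F"], "F": ["E", "sink"],
    "sink": ["C", "F"],
}
conzone = {"B", "C"}            # nodes discovered to lie on the congested HP path

print("HP route:", shortest_path(graph, "A", "sink"))                   # may use Conzone nodes
print("LP route:", shortest_path(graph, "A", "sink", blocked=conzone))  # detours around them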

Predicting Report

The primary task of association mining is to detect frequently co-occurring groups of items in transactional databases. The intention is to use this knowledge for prediction purposes: if bread, butter, and milk often appear in the same transactions, then the presence of butter and milk in a shopping cart suggests that the customer may also buy bread. More generally, knowing which items a shopping cart contains, we want to predict other items that the customer is likely to add before proceeding to the checkout counter. This paradigm can be exploited in diverse applications. For example, in a Web domain, each "shopping cart" may contain a set of hyperlinks pointing to a Web page; in medical applications, the shopping cart may contain a patient's symptoms, results of lab tests, and diagnoses; in a financial domain, the cart may contain companies held in the same portfolio; and Bollmann-Sdorra et al. proposed a framework that employs frequent itemsets in the field of information retrieval.
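A minimal sketch of this prediction paradigm is shown below: frequent item pairs are counted over a toy set of transactions, and items whose estimated confidence given something already in the cart exceeds a threshold are suggested. The transactions, the threshold, and the pairwise-only rule form are illustrative simplifications of full association mining.

# Minimal sketch: mining frequent item pairs from transactions and using
# them to predict what else a partial shopping cart is likely to contain.
from itertools import combinations
from collections import Counter

transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk", "eggs"},
]

item_counts = Counter()
pair_counts = Counter()
for t in transactions:
    item_counts.update(t)
    pair_counts.update(frozenset(p) for p in combinations(sorted(t), 2))

def predict(cart, min_confidence=0.6):
    """Suggest items whose co-occurrence confidence with some cart item is high."""
    suggestions = {}
    for item in cart:
        for pair, count in pair_counts.items():
            if item in pair:
                other = next(x for x in pair if x != item)
                if other in cart:
                    continue
                confidence = count / item_counts[item]   # estimate of P(other | item)
                suggestions[other] = max(suggestions.get(other, 0.0), confidence)
    return {i: c for i, c in suggestions.items() if c >= min_confidence}

print(predict({"butter", "milk"}))   # suggests 'bread' with its confidence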

In all these databases, prediction of unknown items can play a very important role. For instance, a patient's symptoms are rarely due to a single cause; two or more diseases usually conspire to make the person sick. Having identified one, the physician tends to focus on how to treat this single disorder, ignoring others that may meanwhile be worsening the patient's condition. Such unintentional neglect could be prevented by subjecting the patient to all possible lab tests; however, the number of tests one can undergo is limited by such practical factors as time, cost, and the patient's discomfort.